Deep Learning Weekly: Issue 403

Qwen3: Think Deeper, Act Faster, What Every AI Engineer Should Know About A2A, MCP & ACP, a paper on FlowReasoner: Reinforcing Query-Level Meta-Agents, and many more!

May 07, 2025

This week in deep learning, we bring you Qwen3: Think Deeper, Act Faster, What Every AI Engineer Should Know About A2A, MCP & ACP, and a paper on FlowReasoner: Reinforcing Query-Level Meta-Agents.

You may also enjoy One year of Phi: Small language models making big leaps in AI, Zero to One: Learning Agentic Patterns, a paper on UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer, and more!

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!

Industry

Qwen3: Think Deeper, Act Faster

The Qwen team announced the release of Qwen3, the latest addition to the Qwen family of large language models.

One year of Phi: Small language models making big leaps in AI

Microsoft introduced Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning, marking a new era for small language models.

Hybrid AI model crafts smooth, high-quality videos in seconds

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Adobe Research have now developed a hybrid approach, called “CausVid,” to create videos in seconds.

FutureHouse Platform brings super-intelligent AI research tools to scientists via web and API

FutureHouse, a nonprofit building AI agents for scientific research, announced the launch of FutureHouse Platform, a web-based and API-accessible suite of AI agents designed to accelerate scientific discovery.

Parloa raises $120M at $1B valuation to expand enterprise AI agent platform

Parloa, a startup focused on customer experience agents, announced that it has raised $120 million in new funding to accelerate its expansion across North America and Europe.

MLOps & LLMOps

What Every AI Engineer Should Know About A2A, MCP & ACP

An article on the functionalities, implementation characteristics, and use cases of A2A, MCP, and ACP.

Secrets of Agentic UX: Emerging Design Patterns for Human Interaction with AI Agents

An article that examines how UX designers can effectively work with AI agents by understanding the four key capability types that shape agent behavior and user interaction.

Zero to One: Learning Agentic Patterns

A post that dives into several common agentic patterns, differentiating between more structured workflows and more dynamic agentic patterns.

Building News Agents for Daily News Recaps with MCP, Q, and tmux

A practical article about building a multi-agent system with MCP, Q, and tmux that produces daily news recaps by coordinating sub-agents to process various news feeds.

Learning

Sycophancy and the art of the model

An insightful article exploring the GPT-4o sycophancy episode, its connection to RLHF and preference-tuning challenges, and its broader implications for model training, evaluation, and industry transparency.

AMIE gains vision: A research AI agent for multimodal diagnostic dialogue

Google shares a first-of-its-kind demonstration of a multimodal conversational diagnostic AI agent called AMIE.

Libraries & Code

microsoft/bitnet

Official inference framework for 1-bit LLMs

yuanze-lin/Olympus

Olympus: A Universal Task Router for Computer Vision Tasks

Papers & Publications

FlowReasoner: Reinforcing Query-Level Meta-Agents

Abstract:

This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first endow FlowReasoner with the basic ability to reason about generating multi-agent systems. Then, we further enhance it via reinforcement learning (RL) with external execution feedback. A multi-purpose reward is designed to guide the RL training from the aspects of performance, complexity, and efficiency. In this manner, FlowReasoner is enabled to generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% in accuracy across three benchmarks.
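
For readers who want a feel for what a "multi-purpose reward" could look like, here is a minimal Python sketch. The abstract does not give the reward formula, so everything below is an assumption made for illustration: the `ExecutionFeedback` fields, the weighted-sum form, the weights, and the normalization caps are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical summary of running one generated multi-agent system."""
    pass_rate: float   # performance: fraction of test cases passed, in [0, 1]
    num_agents: int    # complexity: how many agents the generated system uses
    runtime_s: float   # efficiency: wall-clock seconds spent executing it


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,
    w_complexity: float = 0.1,
    w_efficiency: float = 0.1,
    max_agents: int = 10,
    time_budget_s: float = 60.0,
) -> float:
    """Assumed weighted sum: reward performance, penalize size and runtime."""
    # Normalize both penalties into [0, 1] so the weights are comparable.
    complexity_penalty = min(fb.num_agents / max_agents, 1.0)
    efficiency_penalty = min(fb.runtime_s / time_budget_s, 1.0)
    return (
        w_perf * fb.pass_rate
        - w_complexity * complexity_penalty
        - w_efficiency * efficiency_penalty
    )


if __name__ == "__main__":
    small = ExecutionFeedback(pass_rate=1.0, num_agents=2, runtime_s=6.0)
    large = ExecutionFeedback(pass_rate=1.0, num_agents=8, runtime_s=45.0)
    print(multi_purpose_reward(small), multi_purpose_reward(large))
```

With equal pass rates, the smaller and faster system scores higher under this sketch, which is the kind of trade-off a reward spanning performance, complexity, and efficiency is meant to express.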

UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer

Abstract:

This report presents UniAnimate-DiT, an advanced project that leverages the cutting-edge and powerful capabilities of the open-source Wan2.1 model for consistent human image animation. Specifically, to preserve the robust generative capabilities of the original Wan2.1 model, we implement the Low-Rank Adaptation (LoRA) technique to fine-tune a minimal set of parameters, significantly reducing training memory overhead. A lightweight pose encoder consisting of multiple stacked 3D convolutional layers is designed to encode motion information of driving poses. Furthermore, we adopt a simple concatenation operation to integrate the reference appearance into the model and incorporate the pose information of the reference image for enhanced pose alignment. Experimental results show that our approach achieves visually appealing and temporally consistent high-fidelity animations. Trained on 480p (832x480) videos, UniAnimate-DiT demonstrates strong generalization capabilities to seamlessly upscale to 720p (1280x720) during inference.
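
To make the LoRA idea in the abstract concrete, here is a dependency-free Python sketch of a single linear layer with a low-rank update. It is not the UniAnimate-DiT or Wan2.1 code: the `LoRALinear` class, the tiny matrix sizes, and the `alpha / rank` scaling are illustrative assumptions that follow the commonly described LoRA formulation, where the frozen weight `W` is augmented as `W + (alpha / rank) * B @ A`.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Illustrative linear layer: frozen weight plus a trainable low-rank update."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weight, never updated
        self.scale = alpha / rank     # assumed conventional LoRA scaling
        # A starts small and random, B starts at zero, so the layer initially
        # behaves exactly like the frozen pretrained layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A and B would receive gradients during fine-tuning.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    layer = LoRALinear(identity, rank=1)
    print(layer.forward([1.0, 2.0, 3.0, 4.0]), layer.trainable_params())
```

For a weight of shape `d_out x d_in`, the adapter trains `rank * (d_in + d_out)` values instead of `d_in * d_out`, which is where the memory saving mentioned in the abstract comes from when the rank is small.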

A guest post by Miko Planas
