Deep Learning Weekly: Issue 426
Introducing Gemini Enterprise, A small number of samples can poison LLMs of any size, a paper on Self-Adapting Language Models, and many more!
This week in deep learning, we bring you Introducing Gemini Enterprise, A small number of samples can poison LLMs of any size, and a paper on Self-Adapting Language Models.
You may also enjoy Microsoft AI’s MAI-Image-1, Agents 2.0: From Shallow Loops to Deep Agents, a paper on Making, not Taking, the Best of N, and more!
As always, happy reading and hacking. If you have something you think should be in next week’s issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Google introduced Gemini Enterprise, a complete, AI-optimized platform that includes a no-code workbench, a centralized governance framework, and integrations with existing business applications.
Introducing MAI-Image-1, debuting in the top 10 on LMArena
Microsoft AI announced MAI-Image-1, their first image generation model developed entirely in-house, debuting in the top 10 text-to-image models on LMArena.
Salesforce announces Agentforce 360 as enterprise AI competition heats up
Salesforce announced the latest version of Agentforce 360, which includes new ways to instruct, build, and deploy AI agents.
Kernel raises $22M to power browser infrastructure for AI agents
Kernel has raised $22 million in funding to scale its platform so AI agents can reliably navigate, persist, and use the web.
MLOps & LLMOps
Agents 2.0: From Shallow Loops to Deep Agents
An architectural post about the shift from “Shallow Agents” to “Deep Agents” that utilize explicit planning, sub-agents, and persistent memory to solve complex, multi-step problems.
Rearchitecting Letta’s Agent Loop: Lessons from ReAct, MemGPT, & Claude Code
A technical post detailing the rearchitecture of Letta’s agent loop, transitioning from older models like MemGPT to a V1 design leveraging modern LLM capabilities such as native reasoning.
Securing your agents with authentication and authorization
An article on securing agents by implementing authentication and authorization (AuthN/AuthZ), addressing their dynamic access needs.
Learning
A small number of samples can poison LLMs of any size (Anthropic)
An article on data-poisoning attacks showing that as few as 250 malicious documents can backdoor LLMs of any size, challenging the assumption that attackers must control a fixed percentage of the training data.
A strategic blog post analyzing the high costs and risks of upgrading vector embedding models at scale, offering a decision framework that balances cutting-edge performance with stability and operational constraints.
Towards a Typology of Strange LLM Chains-of-Thought
A post outlining six causes for why LLMs trained with RLVR develop strange Chains-of-Thought, including hypotheses like Spandrels and Context Refresh.
SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips
A research blog post introducing SuperOffload, which leverages Superchip architectures like NVIDIA GH200 to boost LLM training throughput up to 4x higher than existing offloading solutions.
Libraries & Code
An open-source LLM evaluation tool used to debug, evaluate, and monitor LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Sparsify transformers with SAEs and transcoders
Papers & Publications
Self-Adapting Language Models
Abstract:
Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit: a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation. To train the model to produce effective self-edits, we use a reinforcement learning loop with the downstream performance of the updated model as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model’s own generation to control its adaptation process. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation.
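The loop the abstract describes (model proposes a self-edit, SFT applies it as a persistent update, downstream performance becomes the RL reward) can be sketched as a toy Python loop; every function below is a hypothetical stand-in, not the authors' implementation.

```python
# Toy sketch of the SEAL outer loop from the abstract. The "model" is a
# plain dict and all functions are hypothetical stand-ins for the real
# generation, finetuning, and evaluation steps.

def generate_self_edit(model, task):
    # The model proposes its own finetuning data and update directive
    # (e.g., restructured text plus optimization hyperparameters).
    return {"data": [f"restated: {task}"], "lr": 1e-4}

def apply_sft(model, self_edit):
    # Supervised finetuning on the self-edit yields a persistent weight
    # update; here we just record the edit on the toy model.
    model["edits"].append(self_edit)
    return model

def downstream_reward(model, task):
    # Reward signal = performance of the *updated* model on the task.
    return 1.0 if model["edits"] else 0.0

def seal_step(model, task):
    edit = generate_self_edit(model, task)
    updated = apply_sft(model, edit)
    # In SEAL, this reward trains the self-edit-generation policy via RL.
    return updated, downstream_reward(updated, task)

model = {"edits": []}
model, reward = seal_step(model, "new fact to incorporate")
```

The key design point the abstract emphasizes is that the same model both generates the edit and is updated by it, with no separate adaptation module.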
Making, not Taking, the Best of N
Abstract:
Obtaining high-quality generations in modern LLMs has largely been framed as a selection problem: identifying a single winning generation from a diverse pool of N samples, the Best-of-N (BoN). Yet, this approach is inherently zero-sum, discarding diverse and potentially useful information from the pool. Instead, we explore a collaborative setup, where all candidates can potentially contribute to the final winning generation. To this end, we propose Fusion-of-N (FusioN): a method that uses a general LLM judge to synthesize the most informative elements of each sample into a single final answer. We compare FusioN to BoN in two settings: (i) test-time scaling, where we sample and aggregate from a single model at test time, and (ii) synthetic data generation, where we fuse samples from a pool of diverse teachers to improve a student model. We extensively benchmark both setups across 11 languages, 3 diverse tasks, and varying model scales. Across the benchmarks, FusioN consistently outperforms BoN, showing versatility and robustness both in test-time scaling and in downstream gains from synthetic data generation. We also perform extensive analysis on FusioN, where it shows surprising strengths and robustness under challenging settings. These results show that we should shift how we think about evaluating and utilizing LLM generations, from a monolithic measure of quality to embracing their polylithic nature. This shift allows us to integrate diverse strengths, unlock latent potential, and achieve improvements that were previously inaccessible through selection alone.
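The contrast the abstract draws (BoN selects one sample and discards the rest; FusioN has a judge synthesize all of them) can be illustrated with a minimal Python sketch; the quality score and the "judge" below are hypothetical stand-ins for a learned reward model and an LLM judge.

```python
# Toy contrast between Best-of-N and Fusion-of-N as described in the
# abstract. The scorer and judge are stand-ins, not the paper's models.

def best_of_n(samples, score):
    # BoN: zero-sum selection of the single highest-scoring generation;
    # information in the other samples is discarded.
    return max(samples, key=score)

def fusion_of_n(samples, judge):
    # FusioN: a judge synthesizes informative elements of *every* sample
    # into one final answer, so all candidates can contribute.
    return judge(samples)

samples = ["Paris is the capital.", "Population ~2.1M.", "On the Seine."]
score = len                            # stand-in quality score
judge = lambda xs: " ".join(xs)        # stand-in for an LLM judge

winner = best_of_n(samples, score)     # keeps one sample only
fused = fusion_of_n(samples, judge)    # combines all three
```

With a real LLM judge the fusion step is a generation conditioned on all N samples, not a concatenation; the sketch only shows why fusion can retain information that selection throws away.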