Deep Learning Weekly: Issue 410
ChatGPT's Impact On Our Brains, Midjourney's first image-to-video model, a paper on Self-Adapting Language Models, and many more!
This week in deep learning, we bring you ChatGPT's Impact On Our Brains, Midjourney's first image-to-video model, and a paper on Self-Adapting Language Models.
You may also enjoy Beware General Claims about “Generalizable Reasoning Capabilities”, a paper on Identities are not Interchangeable: The Problem of Overgeneralization in Fair Machine Learning, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
ChatGPT's Impact On Our Brains According to an MIT Study
A new study from researchers at MIT’s Media Lab reports concerning results about ChatGPT’s impact on critical thinking and learning skills.
Midjourney launches its first AI video generation model, V1
Midjourney announced the launch of its image-to-video model, V1, which lets users upload an image and generates a set of four five-second videos based on it.
Federal judge rules Anthropic's AI training on purchased books is fair use
A federal judge ruled Anthropic's AI training on legally purchased books is "fair use", but ordered a trial over claims it used thousands of pirated copies from the internet.
MLOps & LLMOps
The Art of Scaling a Vector Database
A blog post on the art of scaling a vector database like Weaviate, explaining horizontal (sharding, replication) and vertical scaling methods for performance, resilience, and high availability in AI applications.
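To make the horizontal-scaling idea concrete, here is a purely illustrative Python sketch of hash-based sharding with replication; the shard count, replication factor, and placement scheme are assumptions for illustration, not Weaviate's actual implementation.

```python
import hashlib

NUM_SHARDS = 3          # horizontal scaling: data is split across shards (assumed count)
REPLICATION_FACTOR = 2  # each shard is copied to additional nodes for resilience

def shard_for(object_id: str) -> int:
    """Deterministically route an object to a shard by hashing its ID."""
    digest = hashlib.md5(object_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def replicas_for(shard: int, num_nodes: int = 6) -> list[int]:
    """Place a shard's replicas on consecutive nodes (simplified placement)."""
    return [(shard + i) % num_nodes for i in range(REPLICATION_FACTOR)]

obj = "vector-0042"
shard = shard_for(obj)
print(f"{obj} -> shard {shard}, nodes {replicas_for(shard)}")
```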
Why Your Vibe Coding Generates Outdated Code and How to Fix It with Milvus MCP
A blog post on addressing AI assistants generating outdated code due to stale training data.
Build an agentic multimodal AI assistant with Amazon Nova and Amazon Bedrock Data Automation
A comprehensive blog post on how to build an agentic multimodal AI assistant, leveraging a "Reason-Act-Observe-Loop" workflow and various AWS services.
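As a rough illustration of the Reason-Act-Observe pattern the post describes, here is a minimal sketch in Python; call_model and the tools registry are hypothetical placeholders, not the Amazon Nova or Amazon Bedrock Data Automation APIs.

```python
# Minimal Reason-Act-Observe loop sketch (hypothetical call_model and tools,
# not the actual Amazon Nova / Bedrock APIs).

def call_model(messages: list[dict]) -> dict:
    """Placeholder for an LLM call returning either a tool request or a final answer."""
    raise NotImplementedError("wire up your model client here")

tools = {
    "search_docs": lambda query: f"results for {query!r}",  # hypothetical tool
}

def run_agent(user_input: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = call_model(messages)  # Reason: the model decides what to do next
        if decision.get("final_answer"):
            return decision["final_answer"]
        observation = tools[decision["tool"]](decision["input"])  # Act: run the chosen tool
        messages.append({"role": "tool", "content": observation})  # Observe: feed the result back
    return "stopped after max_steps"
```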
Learning
Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems)
An analytical blog post offering a critical perspective on claims about LLMs' fundamental limitations in generalizable reasoning, arguing that critiques often overlook mundane explanations for failures in toy settings.
Agentic Misalignment: How LLMs could be insider threats
An article about stress-testing LLMs in simulated corporate environments, revealing agentic misalignment where models engaged in malicious insider behaviors.
The limits of prediction: from the Oracle of Delphi to artificial intelligence
An article exploring the limits of prediction from ancient oracles to modern AI, outlining five conditions for successful forecasting and more.
Libraries & Code
In-depth tutorials on LLMs, RAG, and real-world AI agent applications.
An open-source LLM evaluation tool used to debug, evaluate, monitor LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Papers & Publications
Self-Adapting Language Models
Abstract:
Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit: a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation. To train the model to produce effective self-edits, we use a reinforcement learning loop with the downstream performance of the updated model as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model's own generation to control its adaptation process. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation.
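For readers who want the shape of the method, below is a highly simplified sketch of the training loop as described in the abstract; every function name is a placeholder standing in for a component of the paper, not the authors' code.

```python
# Simplified sketch of the SEAL outer loop described in the abstract.
# All functions below are placeholders, not the paper's implementation.

def generate_self_edit(model, context):
    """The model produces a self-edit: restructured data, hyperparameters, or tool calls."""
    raise NotImplementedError

def finetune(model, self_edit):
    """Supervised finetuning on the self-edit yields a persistent weight update."""
    raise NotImplementedError

def evaluate(model, queries):
    """Downstream performance of the updated model, used as the RL reward."""
    raise NotImplementedError

def rl_update(model, self_edit, reward):
    """Reinforce self-edits that led to better downstream performance."""
    raise NotImplementedError

def seal_training_step(model, task):
    self_edit = generate_self_edit(model, task["context"])
    updated_model = finetune(model, self_edit)
    reward = evaluate(updated_model, task["heldout_queries"])
    return rl_update(model, self_edit, reward)
```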
Identities are not Interchangeable: The Problem of Overgeneralization in Fair Machine Learning
Abstract:
A key value proposition of machine learning is generalizability: the same methods and model architecture should be able to work across different domains and different contexts. While powerful, this generalization can sometimes go too far, and miss the importance of the specifics. In this work, we look at how fair machine learning has often treated as interchangeable the identity axis along which discrimination occurs. In other words, racism is measured and mitigated the same way as sexism, as ableism, as ageism. Disciplines outside of computer science have pointed out both the similarities and differences between these different forms of oppression, and in this work we draw out the implications for fair machine learning. While certainly not all aspects of fair machine learning need to be tailored to the specific form of oppression, there is a pressing need for greater attention to such specificity than is currently evident. Ultimately, context specificity can deepen our understanding of how to build more fair systems, widen our scope to include currently overlooked harms, and, almost paradoxically, also help to narrow our scope and counter the fear of an infinite number of group-specific methods of analysis.