Deep Learning Weekly: Issue #312
CSAIL's PhotoGuard, Design Patterns for LLM Systems & Products, Training Diffusion Models with RL, a paper on LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition, and many more!
This week in deep learning, we bring you CSAIL's PhotoGuard to protect against AI image manipulation, Eugene Yan's Design Patterns for LLM Systems & Products, Training Diffusion Models with Reinforcement Learning, and a paper on LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition.
You may also enjoy Frontier Model Forum, 7 Ways to Monitor Large Language Model Behavior, A Gentle Introduction to Bayesian Deep Learning, a paper on Universal and Transferable Adversarial Attacks on Aligned Language Models, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Anthropic, Google, Microsoft, and OpenAI are launching the Frontier Model Forum, an industry body focused on ensuring safe and responsible development of frontier AI models.
Using AI to protect against AI image manipulation
MIT CSAIL developed PhotoGuard, a technique that adds imperceptible perturbations to images, effectively disrupting a generative model's ability to manipulate them.
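To make the idea concrete, here is a minimal PGD-style sketch of the kind of "immunizing" perturbation described above; it is not PhotoGuard's actual code, and `encoder` is a hypothetical image-to-latent module (e.g., the VAE encoder of a latent diffusion model) assumed for illustration:

```python
import torch

def immunize(image, encoder, steps=40, eps=8/255, step_size=2/255):
    # Find a small perturbation that pushes the image's latent representation
    # away from its clean value, so an editing model no longer "sees" the
    # original content.
    with torch.no_grad():
        clean_latent = encoder(image)
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = -torch.nn.functional.mse_loss(encoder(image + delta), clean_latent)
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()   # step that increases latent distance
            delta.clamp_(-eps, eps)                  # keep the perturbation imperceptible
        delta.grad = None
    return (image + delta).clamp(0, 1).detach()

# Toy usage with a stand-in "encoder" so the sketch runs end to end.
toy_encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))
img = torch.rand(1, 3, 64, 64)
protected = immunize(img, toy_encoder)
print((protected - img).abs().max())   # stays within the eps budget
```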
NeurIPS presents an LLM efficiency challenge to tackle transparency, evaluation, and hardware challenges, and to democratize access to state-of-the-art LLMs.
Programmatic Custom Model Creation
The Cohere team introduced programmatic custom model creation (beta) with their Python SDK.
A simpler method for learning to control a robot
Researchers from MIT and Stanford created a machine learning method that can derive controllers for robots, drones, or autonomous vehicles that are more effective than those produced by other approaches.
Seattle startup that helps companies protect their AI and machine learning code raises $35M
Seattle cybersecurity startup Protect AI landed $35 million to boost the rollout of its platform that helps enterprises shore up their machine learning code.
MLOps
A comprehensive guide to LLM inference and serving, with detailed comparisons.
7 Ways to Monitor Large Language Model Behavior
A blog post that discusses seven groups of metrics you can use to track LLM behavior.
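As a generic illustration of this kind of monitoring (the article's seven metric groups are not reproduced here, and the patterns below are assumptions for demonstration), you might log a few simple per-response statistics:

```python
import re

# Hypothetical refusal patterns, purely for illustration.
REFUSAL_PATTERNS = re.compile(r"as an ai|i can(?:not|'t) help", re.IGNORECASE)

def response_metrics(prompt: str, response: str) -> dict:
    return {
        "response_chars": len(response),
        "response_words": len(response.split()),
        "refusal_flag": bool(REFUSAL_PATTERNS.search(response)),
        "prompt_overlap": len(set(prompt.lower().split()) & set(response.lower().split())),
    }

print(response_metrics("Summarize the report", "I cannot help with that request."))
```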
From hackathon to production: Unveiling Etsy’s image search revolution
In this engaging fireside chat, senior engineers at Etsy discuss the development and implementation of an innovative image search product.
Design Patterns for LLM Systems & Products
Eugene Yan’s post about practical patterns for integrating large language models (LLMs) into systems and products.
How Comet Can Serve Your LLM Project from Pre-Training to Post-Deployment
An article that explores how Comet can be useful for training, developing, and deploying large-scale machine learning models.
Learning
A Gentle Introduction to Bayesian Deep Learning
An article that gently introduces the field of Bayesian Deep Learning.
Developing a Pallet Detection Model Using OpenUSD and Synthetic Data
A blog post that describes how to develop a pallet detection model using OpenUSD and synthetic data.
Training Diffusion Models with Reinforcement Learning
A post that shows how diffusion models can be trained on these downstream objectives directly using reinforcement learning (RL).
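The core trick is to treat each denoising step as a stochastic policy action and apply a policy-gradient update so the final sample scores well under a reward. The toy sketch below (not the post's code; the denoiser, reward, and hyperparameters are stand-ins) illustrates a REINFORCE-style update over a denoising trajectory:

```python
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))
    def forward(self, x, t):
        t_embed = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_embed], dim=-1))  # predicted mean of x_{t-1}

def reward(x0):
    # Hypothetical downstream objective: prefer samples with small norm.
    return -x0.norm(dim=-1)

model = ToyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
T, sigma, batch = 10, 0.1, 32

for step in range(100):
    x = torch.randn(batch, 16)              # start from pure noise
    log_probs = []
    for t in reversed(range(T)):            # sample a denoising trajectory
        dist = torch.distributions.Normal(model(x, t), sigma)
        x = dist.sample()                   # each step is a Gaussian "action"
        log_probs.append(dist.log_prob(x).sum(dim=-1))
    r = reward(x).detach()
    advantage = r - r.mean()                # simple baseline to reduce variance
    loss = -(torch.stack(log_probs).sum(dim=0) * advantage).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```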
Libraries & Code
The open-source tool for building high-quality datasets and computer vision models.
Pose estimation models implemented in PyTorch Lightning, supporting massively accelerated training on unlabeled videos using NVIDIA DALI.
Official repo for Lamini's finetuning pipeline, so you can train custom models on your data.
Papers & Publications
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Abstract:
Low-rank adaptations (LoRA) are often employed to fine-tune large language models (LLMs) for new tasks. This paper investigates LoRA composability for cross-task generalization and introduces LoraHub, a strategic framework devised for the purposive assembly of LoRA modules trained on diverse given tasks, with the objective of achieving adaptable performance on unseen tasks. With just a few examples from a novel task, LoraHub enables the fluid combination of multiple LoRA modules, eradicating the need for human expertise. Notably, the composition requires neither additional model parameters nor gradients. Our empirical results, derived from the Big-Bench Hard (BBH) benchmark, suggest that LoraHub can effectively mimic the performance of in-context learning in few-shot scenarios, excluding the necessity of in-context examples alongside each inference input. A significant contribution of our research is the fostering of a community for LoRA, where users can share their trained LoRA modules, thereby facilitating their application to new tasks. We anticipate this resource will widen access to and spur advancements in general intelligence as well as LLMs in production.
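The abstract's gradient-free composition can be sketched in a few lines: merge several LoRA deltas with scalar coefficients and keep whichever coefficients minimize loss on the handful of examples from the unseen task. The toy below is self-contained and not the official LoraHub code; plain random search stands in for the paper's evolutionary optimizer, and the "task loss" is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, k = 32, 4, 5                                   # hidden dim, LoRA rank, number of modules
base_W = rng.normal(size=(d, d))
loras = [(rng.normal(size=(r, d)), rng.normal(size=(d, r))) for _ in range(k)]  # (A_i, B_i)
X, Y = rng.normal(size=(8, d)), rng.normal(size=(8, d))  # a few "examples" from the new task

def compose(coeffs):
    # Weighted sum of LoRA deltas added to the frozen base weight.
    return base_W + sum(w * (B @ A) for w, (A, B) in zip(coeffs, loras))

def few_shot_loss(W):
    return float(((X @ W.T - Y) ** 2).mean())        # toy stand-in for the LLM's loss

best, best_loss = None, float("inf")
for _ in range(500):                                  # gradient-free: no backprop anywhere
    coeffs = rng.uniform(-1.5, 1.5, size=k)
    loss = few_shot_loss(compose(coeffs))
    if loss < best_loss:
        best, best_loss = coeffs, loss
print("best coefficients:", np.round(best, 2), "loss:", round(best_loss, 3))
```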
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Abstract:
We study how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control to boost generalization and enable emergent semantic reasoning. Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web. To this end, we propose to co-fine-tune state-of-the-art vision-language models on both robotic trajectory data and Internet-scale vision-language tasks, such as visual question answering. In contrast to other approaches, we propose a simple, general recipe to achieve this goal: in order to fit both natural language responses and robotic actions into the same format, we express the actions as text tokens and incorporate them directly into the training set of the model in the same way as natural language tokens. We refer to such category of models as vision-language-action models (VLA) and instantiate an example of such a model, which we call RT-2. Our extensive evaluation (6k evaluation trials) shows that our approach leads to performant robotic policies and enables RT-2 to obtain a range of emergent capabilities from Internet-scale training. This includes significantly improved generalization to novel objects, the ability to interpret commands not present in the robot training data (such as placing an object onto a particular number or icon), and the ability to perform rudimentary reasoning in response to user commands (such as picking up the smallest or largest object, or the one closest to another object). We further show that incorporating chain of thought reasoning allows RT-2 to perform multi-stage semantic reasoning, for example figuring out which object to pick up for use as an improvised hammer (a rock), or which type of drink is best suited for someone who is tired (an energy drink).
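The key representational move in the abstract is expressing actions as text tokens. A rough sketch of that idea: discretize each continuous action dimension into bins and emit the bin indices as plain text, so actions share the model's language output space. The bin count and action ranges below are assumptions for illustration, not RT-2's exact scheme:

```python
import numpy as np

LOW, HIGH, BINS = -1.0, 1.0, 256

def action_to_tokens(action):
    bins = np.clip(((action - LOW) / (HIGH - LOW) * (BINS - 1)).round(), 0, BINS - 1)
    return " ".join(str(int(b)) for b in bins)        # e.g. "142 25 127 191"

def tokens_to_action(text):
    bins = np.array([int(t) for t in text.split()])
    return bins / (BINS - 1) * (HIGH - LOW) + LOW

a = np.array([0.12, -0.8, 0.0, 0.5])
print(action_to_tokens(a))                            # action rendered as text
print(tokens_to_action(action_to_tokens(a)))          # round-trips to within bin resolution
```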
Universal and Transferable Adversarial Attacks on Aligned Language Models
Abstract:
Because "out-of-the-box" large language models are capable of generating a great deal of objectionable content, recent work has focused on aligning these models in an attempt to prevent undesirable generation. While there has been some success at circumventing these measures -- so-called "jailbreaks" against LLMs -- these attacks have required significant human ingenuity and are brittle in practice. In this paper, we propose a simple and effective attack method that causes aligned language models to generate objectionable behaviors. Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer). However, instead of relying on manual engineering, our approach automatically produces these adversarial suffixes by a combination of greedy and gradient-based search techniques, and also improves over past automatic prompt generation methods.
Surprisingly, we find that the adversarial prompts generated by our approach are quite transferable, including to black-box, publicly released LLMs. Specifically, we train an adversarial attack suffix on multiple prompts (i.e., queries asking for many different types of objectionable content), as well as multiple models (in our case, Vicuna-7B and 13B). When doing so, the resulting attack suffix is able to induce objectionable content in the public interfaces to ChatGPT, Bard, and Claude, as well as open source LLMs such as LLaMA-2-Chat, Pythia, Falcon, and others. In total, this work significantly advances the state-of-the-art in adversarial attacks against aligned language models, raising important questions about how such systems can be prevented from producing objectionable information.
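The search procedure the abstract describes, combining gradient guidance with greedy evaluation, can be caricatured as follows. This toy is not the authors' code: `toy_loss` stands in for the LLM's loss on an affirmative target response, and the vocabulary, suffix length, and objective are made up for illustration.

```python
import torch

vocab, suffix_len, topk = 100, 8, 5
emb = torch.nn.Embedding(vocab, 16)
emb.weight.requires_grad_(False)
target = torch.randn(16)

def toy_loss(one_hot):
    # Stand-in objective: pull the mean suffix embedding toward a fixed target vector.
    return ((one_hot @ emb.weight).mean(0) - target).pow(2).sum()

suffix = torch.randint(0, vocab, (suffix_len,))
for _ in range(200):
    one_hot = torch.nn.functional.one_hot(suffix, vocab).float().requires_grad_(True)
    loss = toy_loss(one_hot)
    loss.backward()
    grad = one_hot.grad                                  # d(loss)/d(token one-hots)
    pos = torch.randint(0, suffix_len, (1,)).item()      # position to edit this round
    candidates = (-grad[pos]).topk(topk).indices         # substitutions the gradient favors
    best_tok, best_loss = suffix[pos].item(), loss.item()
    for tok in candidates.tolist():                      # greedy: keep only real improvements
        trial = suffix.clone()
        trial[pos] = tok
        trial_loss = toy_loss(torch.nn.functional.one_hot(trial, vocab).float()).item()
        if trial_loss < best_loss:
            best_tok, best_loss = tok, trial_loss
    suffix[pos] = best_tok
print("final toy suffix tokens:", suffix.tolist())
```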