Deep Learning Weekly: Issue #298
A Worm-inspired Liquid Neural Network for Drones, More Design Patterns For ML Systems, A Dialogue Model for Academic Research, and a paper on Fundamental Limitations of Alignment in Large Language Models.
This week in deep learning, we bring you A Worm-inspired Liquid Neural Network for Flying Drones, More Design Patterns For Machine Learning Systems, A Dialogue Model for Academic Research, and a paper on Fundamental Limitations of Alignment in Large Language Models.
You may also enjoy The Stickle-Brick Approach to Big AI, Logging for ML Model Deployments, AI Anthropomorphism, Large Language Models Are Human-Level Prompt Engineers, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
A worm-inspired liquid neural network helps drones fly
Using liquid neural networks, researchers at the Massachusetts Institute of Technology have trained a drone to identify and navigate toward objects in varying environments.
Comet brings prompt tuning tools for large language model development to its platform
As prompt engineering becomes increasingly complex, the need for robust MLOps practices becomes critical; Comet's new features streamline the machine learning lifecycle for prompt management.
Announcing New Tools for Building with Generative AI on AWS
AWS has announced new tools for building generative AI using foundation models from AI21 Labs, Anthropic, Stability AI, and Amazon.
Stability AI Launches the First of its StableLM Suite of Language Models
Stability AI released a new open-source language model called StableLM, which is currently available in 3 billion and 7 billion parameter versions.
The Stickle-Brick Approach To Big AI
Large language models could improve their performance by outsourcing tasks to specialist AIs.
Andrew Ng’s Landing AI makes it easier to create computer vision apps with Visual Prompting
Landing AI just announced the launch of Visual Prompting, which helps make it easier for users to build computer vision applications.
Weaviate reels in $50M for its AI-optimized vector database
Open-source database startup Weaviate announced that it has secured a $50 million investment led by Index Ventures.
MLOps
Introduction to Human Action Recognition (HAR)
This article is a hands-on tutorial about training models to automatically recognize the actions performed by humans based on their movement patterns and appearance features.
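As a rough illustration of the kind of model such a tutorial trains (not the article's own code), here is a minimal PyTorch sketch that fine-tunes a pretrained 3D-ResNet video backbone for action classification; the class count and clip shape are placeholder assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

NUM_CLASSES = 10  # assumption: replace with the number of actions in your dataset

# Load a backbone pretrained on Kinetics-400 and swap in a new classification head.
model = r3d_18(weights="DEFAULT")
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a random batch of clips,
# shaped (batch, channels, frames, height, width).
clips = torch.randn(2, 3, 16, 112, 112)
labels = torch.randint(0, NUM_CLASSES, (2,))

optimizer.zero_grad()
loss = criterion(model(clips), labels)
loss.backward()
optimizer.step()
```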
More Design Patterns For Machine Learning Systems
An article that discusses nine design patterns for machine learning systems, including Reframing, Cascade, Business Rules, and Evaluate Before Deploy.
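To make one of these concrete, here is a hedged sketch of how an "Evaluate Before Deploy" gate might look in practice; the function name, metric, and threshold are illustrative assumptions, not the article's code:

```python
from sklearn.metrics import roc_auc_score

def evaluate_before_deploy(candidate, current, X_holdout, y_holdout, min_gain=0.0):
    """Gate a deployment on holdout performance (illustrative only).

    `candidate` and `current` are any fitted binary classifiers exposing
    predict_proba; the AUC metric and threshold are assumptions.
    """
    cand_auc = roc_auc_score(y_holdout, candidate.predict_proba(X_holdout)[:, 1])
    curr_auc = roc_auc_score(y_holdout, current.predict_proba(X_holdout)[:, 1])
    should_deploy = cand_auc >= curr_auc + min_gain
    return should_deploy, {"candidate_auc": cand_auc, "current_auc": curr_auc}
```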
Logging for ML Model Deployments
A comprehensive guide on how to add logging around an MLModel instance using the decorator pattern.
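As a minimal sketch of that pattern (assuming, rather than reproducing, the guide's MLModel interface), a decorator class can wrap any model and log around its predict call:

```python
import logging

logger = logging.getLogger(__name__)

class MLModel:
    """Stand-in for the guide's MLModel interface (assumed: a predict method)."""
    def predict(self, data):
        raise NotImplementedError

class LoggingDecorator(MLModel):
    """Wraps any MLModel and logs inputs, outputs, and errors around predict()."""
    def __init__(self, model: MLModel):
        self._model = model

    def predict(self, data):
        logger.info("prediction requested", extra={"input": data})
        try:
            result = self._model.predict(data)
        except Exception:
            logger.exception("prediction failed")
            raise
        logger.info("prediction returned", extra={"output": result})
        return result
```

Because the decorator exposes the same interface as the model it wraps, the rest of the deployment code does not need to change.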
Building a large scale unsupervised model anomaly detection system
A post that focuses on how Lyft utilizes the compute layer of LyftLearn to profile model features and predictions, and perform anomaly detection at scale.
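For intuition only (this is not Lyft's implementation), an unsupervised check of this kind can be as simple as robust z-scores over daily feature profiles:

```python
import numpy as np
import pandas as pd

def flag_anomalous_days(daily_feature_means: pd.DataFrame, z_threshold: float = 4.0):
    """Flag (day, feature) cells whose daily mean deviates strongly from history.

    `daily_feature_means` is assumed to be indexed by day with one column per
    profiled feature; the robust z-score rule is illustrative only.
    """
    median = daily_feature_means.median()
    mad = (daily_feature_means - median).abs().median() + 1e-9  # avoid div-by-zero
    robust_z = 0.6745 * (daily_feature_means - median) / mad
    return robust_z.abs() > z_threshold

# Example: 30 days of profiles for two features, with one injected spike.
rng = np.random.default_rng(0)
profiles = pd.DataFrame(
    {"feature_a": rng.normal(1.0, 0.05, 30), "feature_b": rng.normal(5.0, 0.2, 30)}
)
profiles.loc[29, "feature_a"] = 3.0  # simulated drift on the last day
print(flag_anomalous_days(profiles).tail())
```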
Create SageMaker Pipelines for training, consuming and monitoring your batch use cases
A post that shows how to create repeatable pipelines for your batch use cases using Amazon SageMaker Pipelines and other tools.
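Reduced to a single batch transform step, the shape of such a pipeline looks roughly like the sketch below; the role ARN, model name, and S3 paths are placeholders, and the step layout is an assumption rather than the post's exact pipeline:

```python
import sagemaker
from sagemaker.inputs import TransformInput
from sagemaker.transformer import Transformer
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TransformStep

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical role

# Batch transform job over a registered model (names and paths are placeholders).
transformer = Transformer(
    model_name="my-registered-model",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/batch-output/",
    sagemaker_session=session,
)

batch_step = TransformStep(
    name="BatchScore",
    transformer=transformer,
    inputs=TransformInput(data="s3://my-bucket/batch-input/"),
)

pipeline = Pipeline(name="batch-inference-pipeline", steps=[batch_step],
                    sagemaker_session=session)
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()                # kick off one batch run
```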
Traceability & Reproducibility
A post that discusses traceability and reproducibility in machine learning workflows.
Learning
Understanding the Epistemic Uncertainty in Deep Learning
An article that provides an overview of epistemic uncertainty in deep learning.
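One widely used way to estimate it, which the article may or may not cover, is Monte Carlo dropout: keep dropout active at inference and measure disagreement across stochastic forward passes. A minimal PyTorch sketch:

```python
import torch
import torch.nn as nn

# A small classifier with dropout; the architecture is a placeholder.
model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(64, 3)
)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()  # keep dropout stochastic (no weights are updated here)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    mean_probs = probs.mean(dim=0)            # predictive distribution
    epistemic = probs.var(dim=0).sum(dim=-1)  # spread across samples as a rough proxy
    return mean_probs, epistemic

x = torch.randn(4, 20)
mean_probs, uncertainty = mc_dropout_predict(model, x)
print(uncertainty)
```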
Koala: A Dialogue Model for Academic Research
A post that introduces Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.
LangChain & GPT-4 for Code Understanding: Twitter Algorithm
A walkthrough guide on how to use LangChain, Deep Lake, and GPT-4 to understand complex codebases like Twitter's recommendation algorithm.
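The core of that workflow, heavily condensed and with placeholder paths, looks roughly like this (the LangChain class names reflect the library around the time of writing and may differ from the guide):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import DeepLake

# Load and chunk a handful of source files (the path is a hypothetical example).
docs = TextLoader("the-algorithm/home_mixer.scala").load()
chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed the chunks and persist them in a Deep Lake dataset (path is a placeholder).
db = DeepLake.from_documents(
    chunks, OpenAIEmbeddings(), dataset_path="hub://my-org/twitter-algorithm"
)

# Ask questions against the indexed code with a GPT-4 retrieval chain.
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(model_name="gpt-4"), retriever=db.as_retriever()
)
result = qa({"question": "Where is the ranking score computed?", "chat_history": []})
print(result["answer"])
```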
How to Generate Real-World Synthetic Data with CTGAN
An article that discusses the working principles of CTGAN and explores synthetic data using a Streamlit app.
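A minimal sketch of the fitting-and-sampling part using the ctgan package (the toy DataFrame and column choices are placeholders, and the Streamlit exploration is omitted):

```python
import numpy as np
import pandas as pd
from ctgan import CTGAN

# Toy tabular data standing in for the article's real-world dataset.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "age": rng.integers(18, 70, size=500),
    "income": rng.normal(55000, 15000, size=500).round(2),
    "employed": rng.choice(["yes", "no"], size=500, p=[0.8, 0.2]),
})

# Columns CTGAN should model as categorical rather than continuous.
discrete_columns = ["employed"]

model = CTGAN(epochs=5)           # a handful of epochs just to keep the sketch fast
model.fit(data, discrete_columns)
synthetic = model.sample(100)     # 100 synthetic rows with the same schema
print(synthetic.head())
```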
AI Anthropomorphism
An article that discusses anthropomorphism, the tendency to attribute human-like qualities or behavior to non-human entities such as animals, objects, or natural phenomena, and why it becomes problematic when applied to AI systems.
Libraries & Code
Open-source package for accelerated symbolic discovery of fundamental laws.
A repository for Pythia, which combines interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers.
A library that provides solutions for ML practitioners working to create and train models in a way that reduces or eliminates user harm resulting from underlying performance biases.
Papers & Publications
Large Language Models Are Human-Level Prompt Engineers
Abstract:
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model, and most effective prompts have been handcrafted by humans. Inspired by classical program synthesis and the human approach to prompt engineering, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. In our method, we treat the instruction as the "program," optimized by searching over a pool of instruction candidates proposed by an LLM in order to maximize a chosen score function. To evaluate the quality of the selected instruction, we evaluate the zero-shot performance of another LLM following the selected instruction. Experiments on 24 NLP tasks show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 19/24 tasks. We conduct extensive qualitative and quantitative analyses to explore the performance of APE. We show that APE-engineered prompts can be applied to steer models toward truthfulness and/or informativeness, as well as to improve few-shot learning performance by simply prepending them to standard in-context learning prompts.
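Schematically, the search loop described in the abstract can be written as follows; `call_llm` is a hypothetical stand-in for any LLM API, and the meta-prompt wording is paraphrased rather than taken from the paper:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; swap in a real client."""
    return "Translate the input word into French."  # dummy reply so the sketch runs

def propose_instructions(demos, n_candidates=20):
    # Ask an LLM to infer the instruction (the "program") behind input-output demos.
    demo_text = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in demos)
    meta_prompt = (
        "I gave a friend an instruction and some inputs; here are the "
        f"input-output pairs:\n{demo_text}\nThe instruction was:"
    )
    return [call_llm(meta_prompt) for _ in range(n_candidates)]

def score_instruction(instruction, eval_set):
    # Zero-shot score: how often a second LLM following the instruction is correct.
    correct = sum(
        call_llm(f"{instruction}\nInput: {x}\nOutput:").strip() == y
        for x, y in eval_set
    )
    return correct / len(eval_set)

def automatic_prompt_engineer(demos, eval_set):
    # Keep the candidate instruction with the highest zero-shot score.
    candidates = propose_instructions(demos)
    return max(candidates, key=lambda inst: score_instruction(inst, eval_set))

demos = [("dog", "chien"), ("cat", "chat")]
print(automatic_prompt_engineer(demos, demos))
```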
Fundamental Limitations of Alignment in Large Language Models
Abstract:
An important aspect in developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones, a process referred to as alignment. In this paper, we propose a theoretical approach called Behavior Expectation Bounds (BEB) which allows us to formally investigate several inherent characteristics and limitations of alignment in large language models. Importantly, we prove that for any behavior that has a finite probability of being exhibited by the model, there exist prompts that can trigger the model into outputting this behavior, with probability that increases with the length of the prompt. This implies that any alignment process that attenuates undesired behavior but does not remove it altogether, is not safe against adversarial prompting attacks. Furthermore, our framework hints at the mechanism by which leading alignment approaches such as reinforcement learning from human feedback increase the LLM's proneness to being prompted into the undesired behaviors. Moreover, we include the notion of personas in our BEB framework, and find that behaviors which are generally very unlikely to be exhibited by the model can be brought to the front by prompting the model to behave as a specific persona. This theoretical result is being experimentally demonstrated in large scale by the so called contemporary "chatGPT jailbreaks", where adversarial users trick the LLM into breaking its alignment guardrails by triggering it into acting as a malicious persona. Our results expose fundamental limitations in alignment of LLMs and bring to the forefront the need to devise reliable mechanisms for ensuring AI safety.
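To give a sense of the objects involved (schematic notation only; the paper's exact definitions differ in detail), BEB reasons about quantities like the following behavior expectation:

```latex
% Schematic notation only, not the paper's exact formulation.
% A behavior scoring function assigns each output sentence a value in [-1, 1],
% with -1 maximally undesired and +1 maximally desired:
%   B : \Sigma^* \to [-1, 1]
% The behavior expectation of a model P conditioned on a prompt s_0 is then
\[
  B_{P}(s_0) \;=\; \mathbb{E}_{s \sim P(\cdot \mid s_0)}\bigl[ B(s) \bigr].
\]
% Informally, the paper's main result: if the undesired behavior has non-zero
% probability under P, then for every \epsilon > 0 there exists a sufficiently
% long prompt s_0 with  B_{P}(s_0) < -1 + \epsilon.
```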
Naturalistic Head Motion Generation From Speech
Abstract:
Synthesizing natural head motion to accompany speech for an embodied conversational agent is necessary for providing a rich interactive experience. Most prior works assess the quality of generated head motion by comparing them against a single ground-truth using an objective metric. Yet there are many plausible head motion sequences to accompany a speech utterance. In this work, we study the variation in the perceptual quality of head motions sampled from a generative model. We show that, despite providing more diverse head motions, the generative model produces motions with varying degrees of perceptual quality. We finally show that objective metrics commonly used in previous research do not accurately reflect the perceptual quality of generated head motions. These results open an interesting avenue for future work to investigate better objective metrics that correlate with human perception of quality.