Deep Learning Weekly: Issue #229
A machine learning and game-theory model for animal poaching, DeepMind's 280-billion-parameter model, new datasets to democratize speech recognition, a paper on the risks from language models, & more
This week in deep learning, we bring you a machine learning and game-theory model for animal poaching, DeepMind's 280-billion-parameter model named Gopher, new datasets to democratize speech recognition and a paper on the ethical and social risks from language models.
You may also enjoy Meta's AI method for bringing hand-drawn figures to life, the top five edge AI trends to watch in 2022, a deep dive into the implementation of Perceiver IO, a paper on partially local federated learning, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
AI Is Helping to Stop Animal Poaching and Food Insecurity
The machine learning system, dubbed PAWS (Protection Assistant for Wildlife Security), uses data from past patrols to predict where poaching is likely to occur and a game-theory model to help generate randomized, unpredictable patrol routes.
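The two-stage idea behind systems like PAWS can be sketched in a few lines: a model scores map cells by predicted poaching risk, then patrols are drawn from a mixed strategy over those scores so routes stay unpredictable to poachers. This is a toy illustration, not PAWS itself; the cell names and scoring are invented for the example.

```python
import random

def mixed_strategy(risk_scores):
    """Turn per-cell predicted risk into sampling probabilities (a mixed strategy)."""
    total = sum(risk_scores.values())
    return {cell: score / total for cell, score in risk_scores.items()}

def sample_patrol(strategy, n_cells, rng=random):
    """Sample a randomized patrol covering n_cells distinct cells,
    weighted toward high-risk cells but never deterministic."""
    cells = list(strategy)
    weights = [strategy[c] for c in cells]
    patrol = []
    while len(patrol) < n_cells and cells:
        choice = rng.choices(cells, weights=weights)[0]
        idx = cells.index(choice)
        cells.pop(idx)          # sample without replacement
        weights.pop(idx)
        patrol.append(choice)
    return patrol

risk = {"A1": 0.9, "A2": 0.3, "B1": 0.6, "B2": 0.1}  # toy model predictions
print(sample_patrol(mixed_strategy(risk), 2))
```

Randomizing over a risk-weighted distribution, rather than always patrolling the top-scoring cells, is what keeps the routes hard for an adversary to anticipate.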
DeepMind tests the limits of large AI language systems with 280-billion-parameter model
DeepMind, which regularly feeds its work into Google products, has probed the capabilities of large language models by building a language model with 280 billion parameters, named Gopher.
Using AI to bring children's drawings to life
Meta announces a first-of-its-kind AI method that automatically animates children's hand-drawn figures of people and humanlike characters, bringing these drawings to life in a matter of minutes.
OctoML introduces ultra-efficient AI models in latest platform release
OctoML Inc. introduced a new release of its artificial intelligence platform that includes a collection of highly efficient neural networks.
AI Dungeon's creator Latitude launches new Voyage game platform - The Verge
Latitude, the startup behind the GPT-3 based text game called AI Dungeon, is expanding into a new AI-powered game platform called Voyage.
Mobile & Edge
Top 5 Edge AI Trends to Watch in 2022
A short blog highlighting the top five edge AI trends NVIDIA expects to see in 2022.
Efficient PyTorch: Tensor Memory Format Matters
A technical and comprehensive article that dives into matrix storage/memory representation, introduces Cachegrind, explains memory formats supported by PyTorch Operators, and showcases best practices for model execution with XNNPACK.
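As a minimal sketch of the memory-format distinction the article covers: the same NCHW-shaped tensor can be stored in the default contiguous layout or in channels-last (NHWC) order, which mobile backends such as XNNPACK generally prefer — the shape and values are unchanged, only the strides differ.

```python
import torch

x = torch.randn(1, 3, 32, 32)                    # default contiguous NCHW storage
y = x.to(memory_format=torch.channels_last)      # same shape/values, NHWC storage

print(x.shape == y.shape)                        # same logical shape
print(x.stride(), y.stride())                    # different physical strides
print(y.is_contiguous(memory_format=torch.channels_last))
```

Converting the input once and keeping the whole model in channels-last avoids repeated layout conversions at operator boundaries, which is one of the article's best practices.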
Edge Impulse Announces Series B Funding, Scaling Edge ML for Developers and Enterprises Everywhere
Edge Impulse is announcing $34 million in Series B funding led by Coatue, tripling its 2022 market valuation and growth forecast.
New Datasets to Democratize Speech Recognition Technology
MLCommons set out to create public datasets to ease two pressing bottlenecks for open source speech recognition resources.
Perceiver IO: a scalable, fully-attentional model that works on any modality
A deep dive into the implementation of Perceiver IO, the first Transformer-based neural network that works on all kinds of modalities and combinations thereof.
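The core trick that makes Perceiver IO modality-agnostic can be sketched compactly (sizes and module names below are illustrative, not the reference implementation): flatten any input to a (sequence, channels) array, let a small learned latent array cross-attend to it, run self-attention in latent space, and decode by letting output queries cross-attend to the latents.

```python
import torch
import torch.nn as nn

class PerceiverIOSketch(nn.Module):
    """Toy sketch of the Perceiver IO encode/process/decode pattern."""

    def __init__(self, in_dim=64, latent_dim=64, n_latents=16, n_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, latent_dim))
        self.encode = nn.MultiheadAttention(latent_dim, n_heads, kdim=in_dim,
                                            vdim=in_dim, batch_first=True)
        self.process = nn.MultiheadAttention(latent_dim, n_heads, batch_first=True)
        self.decode = nn.MultiheadAttention(latent_dim, n_heads, batch_first=True)

    def forward(self, inputs, queries):
        # inputs: (batch, M, in_dim) — any modality, flattened
        # queries: (batch, O, latent_dim) — one query per desired output
        z = self.latents.expand(inputs.shape[0], -1, -1)
        z, _ = self.encode(z, inputs, inputs)   # cross-attend: latents <- inputs
        z, _ = self.process(z, z, z)            # self-attention in latent space
        out, _ = self.decode(queries, z, z)     # decode via output queries
        return out

model = PerceiverIOSketch()
x = torch.randn(2, 1000, 64)   # e.g. 1000 input elements of any modality
q = torch.randn(2, 10, 64)     # 10 output queries
print(model(x, q).shape)       # torch.Size([2, 10, 64])
```

Because self-attention only ever runs over the small latent array, cost scales linearly in the input and output sizes rather than quadratically, which is what lets one architecture handle images, audio, and point clouds alike.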
AI and the Future of Work: What We Know Today
A comprehensive study explaining how AI will fundamentally affect the nature of work in the near future.
Training CodeParrot 🦜 from Scratch
A step-by-step guide on how to train a large GPT-2 model called CodeParrot, entirely from scratch.
Libraries & Code
Determined: Deep Learning Training Platform
Determined is an open-source deep learning training platform that makes building models fast and easy.
Picovoice/picovoice: The end-to-end platform for building voice products at scale
Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Google services, Picovoice runs entirely on-device while being more accurate.
murthylab/sleap: A deep learning framework for multi-animal pose tracking.
SLEAP is an open-source, deep learning-based framework for estimating positions of animal body parts.
Papers & Publications
Improving language models by retrieving from trillions of tokens
We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a 2 trillion token database, our Retrieval-Enhanced Transformer (RETRO) obtains comparable performance to GPT-3 and Jurassic-1 on the Pile, despite using 25× fewer parameters. After fine-tuning, RETRO performance translates to downstream knowledge-intensive tasks such as question answering. RETRO combines a frozen Bert retriever, a differentiable encoder and a chunked cross-attention mechanism to predict tokens based on an order of magnitude more data than what is typically consumed during training. We typically train RETRO from scratch, yet can also rapidly RETROfit pre-trained transformers with retrieval and still achieve good performance. Our work opens up new avenues for improving language models through explicit memory at unprecedented scale.
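The retrieval step described in the abstract can be sketched in miniature: split the input into fixed-size chunks, embed each chunk, and fetch its nearest neighbours from a chunked database by similarity. The embedder and database below are toy stand-ins (the paper uses a frozen BERT retriever over a 2-trillion-token corpus); only the chunk-then-retrieve shape is the point.

```python
import numpy as np

def embed(chunk):
    """Stand-in for a frozen chunk embedder: hash tokens into a unit vector."""
    v = np.zeros(16)
    for tok in chunk:
        v[hash(tok) % 16] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

# Toy retrieval database of pre-chunked text.
database = [["the", "cat", "sat"], ["dogs", "bark", "loudly"],
            ["cats", "sit", "on", "mats"]]
db_vecs = np.stack([embed(c) for c in database])

def retrieve_neighbours(sequence, chunk_size=3, k=1):
    """Split the input into chunks; fetch the k nearest database chunks for each.
    In RETRO these neighbours feed the encoder / chunked cross-attention."""
    chunks = [sequence[i:i + chunk_size]
              for i in range(0, len(sequence), chunk_size)]
    out = []
    for chunk in chunks:
        sims = db_vecs @ embed(chunk)            # cosine similarity
        top = np.argsort(-sims)[:k]
        out.append([database[i] for i in top])
    return out

print(retrieve_neighbours(["the", "cat", "sat", "on", "the", "mat"]))
```

Each chunk of the input thus conditions generation on an order of magnitude more text than the model could ever hold in its parameters, which is the source of RETRO's parameter efficiency.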
Ethical and social risks of harm from Language Models
This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguistics, and social sciences.
We outline six specific risk areas: I. Discrimination, Exclusion and Toxicity, II. Information Hazards, III. Misinformation Harms, IV. Malicious Uses, V. Human-Computer Interaction Harms, VI. Automation, Access, and Environmental Harms. The first area concerns the perpetuation of stereotypes, unfair discrimination, exclusionary norms, toxic language, and lower performance by social group for LMs. The second focuses on risks from private data leaks or LMs correctly inferring sensitive information. The third addresses risks arising from poor, false or misleading information, including in sensitive domains, and knock-on risks such as the erosion of trust in shared information. The fourth considers risks from actors who try to use LMs to cause harm. The fifth focuses on risks specific to LMs used to underpin conversational agents that interact with human users, including unsafe use, manipulation or deception. The sixth discusses the risk of environmental harm, job automation, and other challenges that may have a disparate effect on different social groups or communities.
In total, we review 21 risks in depth. We discuss the points of origin of different risks and point to potential mitigation approaches. Lastly, we discuss organisational responsibilities in implementing mitigations, and the role of collaboration and participation. We highlight directions for further research, particularly on expanding the toolkit for assessing and evaluating the outlined risks in LMs.
Federated Reconstruction: Partially Local Federated Learning
Personalization methods in federated learning aim to balance the benefits of federated and local training for data availability, communication cost, and robustness to client heterogeneity. Approaches that require clients to communicate all model parameters can be undesirable due to privacy and communication constraints. Other approaches require always-available or stateful clients, impractical in large-scale cross-device settings. We introduce Federated Reconstruction, the first model-agnostic framework for partially local federated learning suitable for training and inference at scale. We motivate the framework via a connection to model-agnostic meta learning, empirically demonstrate its performance over existing approaches for collaborative filtering and next word prediction, and release an open-source library for evaluating approaches in this setting. We also describe the successful deployment of this approach at scale for federated collaborative filtering in a mobile keyboard application.
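The "partially local" split can be sketched with a toy linear model (all names and sizes here are illustrative, not the paper's setup): global weights are communicated, while a client-specific parameter is never sent — each round the client reconstructs it from scratch on its own data, then updates the global weights with the local parameter held fixed.

```python
import numpy as np

def client_round(g, X, y, lr=0.1, steps=5):
    """One client round: reconstruct local params, then update global params."""
    # Model: prediction = X @ g + l, where l is a private client-specific bias.
    # 1) Reconstruct l from scratch (closed form here: best bias given g).
    l = (y - X @ g).mean()
    # 2) Gradient steps on the global weights with l frozen.
    g_new = g.copy()
    for _ in range(steps):
        err = X @ g_new + l - y
        g_new -= lr * X.T @ err / len(y)
    return g_new - g                    # only the global delta leaves the device

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])     # weights shared across clients
g = np.zeros(3)
for _ in range(20):                     # server loop over rounds
    deltas = []
    for client in range(4):             # each client has private data + offset
        X = rng.normal(size=(32, 3))
        y = X @ true_w + client         # client-specific offset, never shared
        deltas.append(client_round(g, X, y))
    g += np.mean(deltas, axis=0)        # federated averaging of global deltas
print(np.round(g, 2))                   # close to the shared weights [1, -2, 0.5]
```

Because the local parameters are rebuilt every round rather than stored, clients stay stateless, which is what makes the approach workable in large-scale cross-device settings.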