Deep Learning Weekly: Issue #250
DeepMind's multi-modal, multi-task, multi-embodiment generalist policy that resembles AGI, practical lessons for production-level deep learning models, and more.
Hey Folks,
This week in deep learning, we bring you DeepMind's multi-modal, multi-task, multi-embodiment generalist policy that resembles AGI, practical lessons for production-level deep learning models, ambient clinical intelligence with PyTorch, and a paper on eliciting reasoning in large language models through chain-of-thought prompting.
You may also enjoy machine learning for protecting the Great Barrier Reef, multi-GPU training with PyTorch Lightning, fusing foundation model embeddings and weak supervision, a paper on self-adaptive thresholding for semi-supervised learning, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
A Generalist Agent
DeepMind describes and documents the current capabilities of Gato, their new multi-modal, multi-task, multi-embodiment generalist policy.
Using Machine Learning to Help Protect the Great Barrier Reef in Partnership with Australia's CSIRO
Google teamed up with Australia's national science agency, CSIRO, to develop machine learning technology that transforms underwater surveys, helping monitor and map reefs at scale and rapidly identify and prioritize crown-of-thorns starfish (COTS) outbreaks.
Urban Jungle: AI-Generated Endangered Species Mix With Times Square's Nightlife
A deep learning art display is spotlighting lesser-known endangered creatures on Times Square billboards this month, appearing nightly in the few minutes before midnight across nearly 100 screens.
On the road to cleaner, greener, and faster driving
In a new study, MIT researchers demonstrate a machine-learning approach that can learn to control a fleet of autonomous vehicles as they approach and travel through a signalized intersection in a way that keeps traffic flowing smoothly.
Undetectable Backdoors Plantable In Any Machine-Learning Algorithm
According to a new study, undetectable backdoors can be planted into any machine learning algorithm, allowing an attacker unfettered access to the model and the ability to tamper with any of its data.
AI startup Hugging Face raises $100M in funding at $2B valuation
Hugging Face Inc., the operator of a popular platform for hosting artificial intelligence models, announced that it has closed a $100 million funding round led by Lux Capital.
MLOps
Multi GPU Model Training: Monitoring and Optimizing
A technical blog that discusses multi-GPU training with PyTorch Lightning and the best practices to adopt for optimizing the training process.
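For orientation, here is a minimal sketch of what a multi-GPU PyTorch Lightning run looks like; the model, data, and flag values below are illustrative stand-ins, not taken from the post.

```python
import torch
import pytorch_lightning as pl
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy LightningModule standing in for a real model.
class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.net(x), y)
        self.log("train_loss", loss, sync_dist=True)  # aggregate the metric across GPUs
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

if __name__ == "__main__":
    loader = DataLoader(
        TensorDataset(torch.randn(1024, 32), torch.randn(1024, 1)),
        batch_size=64, num_workers=4,
    )
    # DDP runs one process per GPU; the effective batch size becomes
    # batch_size * devices, so the learning rate may need rescaling.
    trainer = pl.Trainer(
        accelerator="gpu",
        devices=4,       # GPUs on this node
        strategy="ddp",
        precision=16,    # mixed precision cuts memory use and often speeds training
        max_epochs=5,
    )
    trainer.fit(LitModel(), loader)
```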
An article on why you should use GPUs for your end-to-end data science workflows, not just for model training and inference but also for ETL jobs.
Lessons From Deploying Deep Learning To Production
The CEO of Aquarium, a company that builds tools to find and fix problems in deep learning datasets, discusses practical lessons learned from deploying deep learning models to production.
How to Integrate D3.js Graphs in a Comet Panel
A step-by-step tutorial on how to build visually appealing panels in Comet using D3.js.
Learning
Ambient Clinical Intelligence: Generating Medical Reports with PyTorch
This article showcases how the Nuance Dragon Ambient eXperience, an ambient clinical intelligence solution, automatically documents patient encounters accurately and efficiently at the point of care, along with the PyTorch-based technologies that enable it.
An Illustrated Tour of Applying BERT to Speech Data
A comprehensive article introducing the wav2vec 2.0 and HuBERT models.
Portrait Depth API: Turning a Single Image into a 3D Photo with TensorFlow.js
An article that highlights how to use the Depth API, the first depth estimation API from TensorFlow.js.
All you need to know about Graph Attention Networks
A blog that discusses the advantages and architecture of Graph Attention Networks.
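For context, the heart of a GAT layer is a learned attention coefficient over each node's neighbors. Below is a minimal single-head sketch in PyTorch (dense adjacency, toy dimensions), intended as an illustration rather than a production implementation.

```python
import torch
from torch import nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention layer (Velickovic et al., 2018), dense form."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops
        z = self.W(h)                                     # (N, out_dim)
        N = z.size(0)
        # e_ij = LeakyReLU(a^T [z_i || z_j]) for every node pair
        pairs = torch.cat(
            [z.unsqueeze(1).expand(N, N, -1), z.unsqueeze(0).expand(N, N, -1)],
            dim=-1,
        )                                                 # (N, N, 2*out_dim)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), 0.2)  # (N, N)
        # mask non-edges, then softmax over each node's neighborhood
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)                  # attention coefficients
        return F.elu(alpha @ z)                           # weighted neighbor aggregation

# toy usage: 4 nodes on a ring graph with self-loops
h = torch.randn(4, 8)
adj = (torch.eye(4) + torch.roll(torch.eye(4), 1, 0)
       + torch.roll(torch.eye(4), -1, 0)).clamp(max=1)
out = GATLayer(8, 16)(h, adj)
```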
Liger: Fusing foundation model embeddings & weak supervision
Mayee Chen, a PhD student in Computer Science at Stanford University, discusses her work, which combines weak supervision and foundation model embeddings to improve two essential aspects of current weak supervision techniques.
Libraries & Code
Welcome fastai to the Hugging Face Hub
A notebook showcasing how fastai practitioners can share and upload models to the Hugging Face Hub.
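As a taste of the integration, here is a sketch using the huggingface_hub helpers for fastai. The repo id is a placeholder, and the training setup is a standard fastai toy example rather than the notebook's own.

```python
from fastai.vision.all import *
from huggingface_hub import push_to_hub_fastai, from_pretrained_fastai

# Train any fastai Learner as usual (toy example on the Oxford-IIIT Pets dataset).
path = untar_data(URLs.PETS)
dls = ImageDataLoaders.from_name_re(
    path, get_image_files(path / "images"),
    pat=r"(.+)_\d+.jpg", item_tfms=Resize(224),
)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(1)

# Upload to the Hub ("your-username/fastai-pets" is a placeholder repo id;
# requires authenticating first, e.g. via `huggingface-cli login`).
push_to_hub_fastai(learner=learn, repo_id="your-username/fastai-pets")

# Anyone can then pull the Learner back down:
learn2 = from_pretrained_fastai("your-username/fastai-pets")
```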
spotify/basic-pitch: A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Basic Pitch is a Python library for Automatic Music Transcription (AMT), using a lightweight neural network developed by Spotify's Audio Intelligence Lab.
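A minimal usage sketch, assuming the library's predict helper and a placeholder audio path:

```python
from basic_pitch.inference import predict

# "song.wav" is a placeholder path; common audio formats should work.
model_output, midi_data, note_events = predict("song.wav")

# midi_data is a pretty_midi.PrettyMIDI object, so it can be written
# straight out as a standard MIDI file.
midi_data.write("song.mid")
print(f"transcribed {len(note_events)} note events")
```

The repo also ships a command-line entry point for batch transcription without writing any Python.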
Papers & Publications
Chain of Thought Prompting Elicits Reasoning in Large Language Models
Abstract:
Although scaling up language model size has reliably improved performance on a range of NLP tasks, even the largest models currently struggle with certain reasoning tasks such as math word problems, symbolic manipulation, and common sense reasoning. This paper explores the ability of language models to generate a coherent chain of thought -- a series of short sentences that mimic the reasoning process a person might have when responding to a question. Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks that otherwise have flat scaling curves. When combined with the 540B parameter PaLM model, chain of thought prompting achieves a new state of the art of 58.1% on the GSM8K benchmark of math word problems.
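Concretely, chain-of-thought prompting prepends worked examples whose answers spell out intermediate steps rather than just the final number; the exemplar below is adapted from the paper's running math word problem example.

```python
# A few-shot chain-of-thought prompt: the exemplar's answer walks through
# the reasoning, so the model learns to do the same on the final question.
prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis
balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought
6 more, how many apples do they have?
A:"""

# A sufficiently large model is expected to continue with its own chain of
# thought, e.g. "They used 20 of 23 apples, leaving 3. 3 + 6 = 9.
# The answer is 9."
```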
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
Abstract:
Pseudo labeling and consistency regularization approaches with confidence-based thresholding have made great progress in semi-supervised learning (SSL). In this paper, we theoretically and empirically analyze the relationship between the unlabeled data distribution and the desirable confidence threshold. Our analysis shows that previous methods might fail to define a favorable threshold since they either require a pre-defined / fixed threshold or an ad-hoc threshold adjusting scheme that does not reflect the learning effect well, resulting in inferior performance and slow convergence, especially for complicated unlabeled data distributions. We hence propose FreeMatch to define and adjust the confidence threshold in a self-adaptive manner according to the model's learning status. To handle complicated unlabeled data distributions more effectively, we further propose a self-adaptive class fairness regularization method that encourages the model to produce diverse predictions during training. Extensive experimental results indicate the superiority of FreeMatch especially when the labeled data are extremely rare. FreeMatch achieves 5.78%, 13.59%, and 1.28% error rate reduction over the latest state-of-the-art method FlexMatch on CIFAR-10 with 1 label per class, STL-10 with 4 labels per class, and ImageNet with 100k labels respectively.
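The abstract does not spell out the update rule, but the self-adaptive idea can be sketched as an exponential moving average of the model's confidence on unlabeled data, with per-class scaling of a global threshold. The momentum, initialization, and exact combination below are assumptions for illustration, not values from the paper.

```python
import torch

# A minimal sketch in the spirit of self-adaptive thresholding: the global
# threshold tracks an EMA of mean confidence on unlabeled batches, and
# per-class thresholds scale it by each class's relative confidence.
class SelfAdaptiveThreshold:
    def __init__(self, num_classes, momentum=0.999):  # momentum is an assumption
        self.m = momentum
        self.tau = torch.tensor(1.0 / num_classes)               # global estimate
        self.p = torch.full((num_classes,), 1.0 / num_classes)   # per-class estimates

    @torch.no_grad()
    def update(self, probs):
        # probs: (B, C) softmax outputs on an unlabeled batch
        conf, _ = probs.max(dim=1)
        self.tau = self.m * self.tau + (1 - self.m) * conf.mean()
        self.p = self.m * self.p + (1 - self.m) * probs.mean(dim=0)

    def mask(self, probs):
        # Keep a pseudo-label only if its confidence clears the class threshold.
        per_class = self.tau * self.p / self.p.max()
        conf, pred = probs.max(dim=1)
        return conf >= per_class[pred]

# usage on a dummy batch
thresh = SelfAdaptiveThreshold(num_classes=10)
probs = torch.softmax(torch.randn(64, 10), dim=1)
thresh.update(probs)
keep = thresh.mask(probs)  # boolean mask selecting confident pseudo-labels
```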