Deep Learning Weekly: Issue #208

AI to tackle climate change, Android ML platform, IceVision, AlphaFold, new funding for AI startups and more

Hey folks,

This week in deep learning, we bring you the release of the Android ML Platform, China’s AI ecosystem, deep-learning-based piano transcription, and large funding rounds for AI startups Ghost and Untether AI.

You may also enjoy an overview of the startups applying AI to tackle climate change, AlphaFold’s code release, IceVision, an object detection framework, Nvidia Canvas and more!

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


These Are The Startups Applying AI To Tackle Climate Change

This article lists promising startups that have emerged to tackle climate change with AI, specializing in fields such as climate intelligence, climate insurance, carbon offsets, and carbon accounting.

Hard choices: AI in health care

This article examines two of the most pressing ethical considerations when it comes to AI applications in health care: the potential loss of physician autonomy and the amplification of underlying biases.

Cyber Valley researchers link AI and society

Based in Germany's Stuttgart-Tübingen region, in a state bordering France and Switzerland, Cyber Valley was founded in 2016 by partners from government, science and industry. Today, it is an active research hub, working in particular on biases hidden in AI systems.

ICML 2021 Workshop: Tackling Climate Change with Machine Learning

A workshop on how ML can help tackle climate change was held on July 23rd at ICML, one of the most prestigious AI research conferences.

China’s business ‘ecosystems’ are helping it win the global A.I. race

According to the authors, the ecosystem China has put in place to help its AI companies ensures it will win the global race against the US and Europe.

Ghost raises $100M Series D for autonomous driving and crash prevention tech

Ghost Locomotion has raised a $100 million Series D funding round. The money will go toward R&D as the company continues to develop its highway self-driving and crash prevention technology.

Mobile & Edge

NVIDIA CEO Unveils ‘First Big Bet’ on Digital Biology Revolution with UK-Based Cambridge-1

Nvidia unveiled Cambridge-1, the U.K.’s most powerful supercomputer, which is meant to support partnerships aimed at breakthroughs with a global impact in digital biology.

Announcing Android’s updatable, fully integrated ML inference stack

Google released Android ML Platform, a fully integrated on-device ML inference stack, ensuring optimal performance on all devices and providing a consistent API that spans Android versions.

MoViNets: Mobile Video Networks for Efficient Video Recognition

This paper presents Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Untether AI nabs $125M for AI acceleration chips

Untether AI develops custom-built chips for AI inference workloads. This round of funding will be used to make progress toward mass-producing its runAI200 chip. The founders claim that data in their architecture moves up to 1,000 times faster than in other chips.


EvalAI: Evaluating state-of-the-art in AI

EvalAI is an open source platform for evaluating and comparing machine learning algorithms at scale. Its partners include the most prestigious academic labs.

Nvidia Canvas

Nvidia Canvas enables the use of AI to turn simple brushstrokes into realistic landscape images, to create backgrounds quickly, or to speed up concept explorations.

Learning to Extrapolate with Generative AI Models

This post introduces deep generative models that can extrapolate beyond their training distribution. This makes them better suited than existing generative models to tasks such as generating natural language or protein sequences.

AlphaFold 2 is here: what’s behind the structure prediction miracle

A detailed and well-written introduction to the inner workings of DeepMind’s AlphaFold 2 algorithm. The author concludes with a personal view on its implications for academic research and biology.

Parallelizing neural networks on one GPU with JAX

This tutorial explains how to make the most of one GPU to train small neural networks by parallelizing training with JAX.

Libraries & Code

ONCE Dataset

The ONCE dataset is a large-scale autonomous driving dataset with 1 million LIDAR frames, 7 million camera images, 144 driving hours and 15k annotated scenes in diverse environments.


DeepMind released the code behind AlphaFold, its algorithm to predict the shape of a protein from its amino acid sequence. Interestingly, it can be run with limited computational resources.

IceVision: An Agnostic Object Detection Framework

IceVision describes itself as the first agnostic computer vision framework. It offers a curated collection of hundreds of high-quality pre-trained models and orchestrates the end-to-end deep learning workflow, letting you train networks with libraries such as PyTorch Lightning and fastai.

Papers & Publications

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better


Deep Learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, information retrieval and more. However, with the progressive improvements in deep learning models, their number of parameters, latency, resources required to train, etc. have all increased significantly. Consequently, it has become important to pay attention to these footprint metrics of a model as well, not just its quality. We present and motivate the problem of efficiency in deep learning, followed by a thorough survey of the five core areas of model efficiency (spanning modeling techniques, infrastructure, and hardware) and the seminal work there. We also present an experiment-based guide along with code, for practitioners to optimize their model training and deployment. We believe this is the first comprehensive survey in the efficient deep learning space that covers the landscape of model efficiency from modeling techniques to hardware support. Our hope is that this survey would provide the reader with the mental model and the necessary understanding of the field to apply generic efficiency techniques to immediately get significant improvements, and also equip them with ideas for further research and experimentation to achieve additional gains.

Evaluating Large Language Models Trained on Code


We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.
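
The effect of repeated sampling can be made concrete with the pass@k metric: the probability that at least one of k samples solves a problem. Below is a minimal sketch, assuming n samples were generated for a problem and c of them passed its unit tests. The function name and the sample counts are illustrative and are not taken from the paper's code or results.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples generated for the problem
    c: how many of those samples passed the unit tests
    k: number of samples we are allowed to draw

    Returns the probability that at least one of k samples, drawn
    without replacement from the n generated, is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) counts the draws that contain only failing samples;
    # it is 0 when fewer than k failing samples exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over a list of (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical outcomes for three problems, 100 samples each.
    results = [(100, 5), (100, 0), (100, 40)]
    for k in (1, 10, 100):
        print(f"pass@{k}: {mean_pass_at_k(results, k):.3f}")
```

With these made-up counts, a problem solved by only 5 of 100 samples contributes little to pass@1 but counts as fully solved at pass@100, which is the intuition behind sampling many candidates for difficult prompts.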

Sequence-to-Sequence Piano Transcription with Transformers


Automatic Music Transcription has seen significant progress in recent years by training custom deep neural networks on large datasets. However, these models have required extensive domain-specific design of network architectures, input/output representations, and complex decoding schemes. In this work, we show that equivalent performance can be achieved using a generic encoder-decoder Transformer with standard decoding methods. We demonstrate that the model can learn to translate spectrogram inputs directly to MIDI-like output events for several transcription tasks. This sequence-to-sequence approach simplifies transcription by jointly modeling audio features and language-like output dependencies, thus removing the need for task-specific architectures. These results point toward possibilities for creating new Music Information Retrieval models by focusing on dataset creation and labeling rather than custom model design.
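
To give a feel for what "MIDI-like output events" might look like as a target sequence, here is a small sketch that flattens a list of notes into a token stream. The token names (TIME_, NOTE_ON_, NOTE_OFF_, EOS), the 10 ms time step, and the use of absolute time indices are assumptions made for illustration, not the paper's actual vocabulary.

```python
from typing import List, Tuple

# A note is (onset in seconds, offset in seconds, MIDI pitch number).
Note = Tuple[float, float, int]


def notes_to_events(notes: List[Note], step: float = 0.01) -> List[str]:
    """Convert notes to a flat, time-ordered list of event tokens.

    Times are quantized to multiples of `step` seconds (hypothetical
    resolution). A TIME token is emitted whenever the clock changes,
    followed by the note events that happen at that time.
    """
    boundaries = []
    for onset, offset, pitch in notes:
        boundaries.append((round(onset / step), 1, pitch))   # 1 = note on
        boundaries.append((round(offset / step), 0, pitch))  # 0 = note off
    # Sort by time; at equal times, note-offs (0) come before note-ons (1).
    boundaries.sort()

    events: List[str] = []
    current = None
    for tick, kind, pitch in boundaries:
        if tick != current:
            events.append(f"TIME_{tick}")
            current = tick
        prefix = "NOTE_ON_" if kind else "NOTE_OFF_"
        events.append(prefix + str(pitch))
    events.append("EOS")  # end-of-sequence marker for the decoder
    return events


if __name__ == "__main__":
    # Two consecutive half-second notes: middle C, then the E above it.
    print(notes_to_events([(0.0, 0.5, 60), (0.5, 1.0, 64)]))
```

A sequence-to-sequence model in this setup would be trained to emit such a token list given spectrogram frames, so the same generic decoder can serve different transcription tasks by changing only the vocabulary.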