Deep Learning Weekly Issue #123

An AI chip from Huawei, a deep dive into Oculus, more efficient transformers, new tools from DeepMind and more

Hey folks,

This week in deep learning we bring you a new AI chip from Huawei, a look at the AI models inside the Oculus Quest, the 774M parameter GPT-2 model, and a new project from Google that uses AR to (finally) put Harriet Tubman on the $20 bill.

You may also enjoy a new loss function for training models on noisy data, a Keras API for training models on encrypted data, a method for training more efficient transformers, and a look at what neural networks can learn from animal brains.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


Google releases “Notable Women” App

Google has launched a new app that uses AR technology to superimpose portraits of famous women onto US currency.

Huawei’s First Commercial AI Chip Doubles the Training Performance of Nvidia’s Flagship GPU

Another contender enters the AI chip race.

Powered by AI: Oculus Insight

Facebook details some of the ways it uses AI-based algorithms in the Oculus Quest.

Amazon’s voice-synthesizing AI mimics shifts in tempo, pitch, and volume

New research from Amazon on improving the quality of voice synthesis to make it more human-like.

GPT-2: 6-Month Follow-Up

OpenAI releases the 774M-parameter GPT-2 model, but still hasn't released the largest one.


Bi-Tempered Logistic Loss for Training Neural Nets with Noisy Data

Researchers from Google propose a new loss function for robust training even when the training data is noisy.
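The "bi-tempered" name refers to two temperature parameters that replace the standard logarithm and exponential in the logistic loss with tempered versions. A minimal sketch of those two building blocks (the full loss also needs a tempered normalization step, omitted here; follow the link for the paper's exact formulation):

```python
import math

def log_t(x, t):
    # Tempered logarithm: reduces to log(x) as t -> 1.
    # For t < 1 it is bounded below, which makes the loss robust to outliers.
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    # Tempered exponential, the inverse of log_t on its range;
    # reduces to exp(x) as t -> 1, heavy-tailed for t > 1.
    if t == 1.0:
        return math.exp(x)
    return max(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))
```

For example, with t = 0.5 the tempered log of any probability is bounded below by -2, so a single badly mislabeled example cannot dominate the loss.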

Encrypted Deep Learning Training and Predictions with TF Encrypted Keras

A high-level Keras API for training models on encrypted data.

Making Transformer networks simpler and more efficient

Facebook’s AI team introduces adaptive span attention and all-layer attention to make transformers more computationally and memory efficient.

Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT

Another technique (distillation) for shrinking transformers from the HuggingFace team.
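DistilBERT trains a smaller student network to match a larger teacher's output distribution (alongside the usual masked-language-modeling objective). A minimal NumPy sketch of the soft-target part in the classic temperature-scaled formulation; the function names here are illustrative, not Hugging Face's API, and the real objective adds further terms:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T gives softer distributions.
    z = (logits - logits.max()) / T
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence from the student's soft distribution to the teacher's,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s)))) * T * T
```

When the student's logits exactly match the teacher's, the loss is zero; any mismatch in the soft distribution is penalized.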

A speech synthesis survey

A nice overview of various speech synthesis techniques.

Libraries & Code

[GitHub] CorentinJ/Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

[GitHub] deepmind/open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

[GitHub] dbolya/yolact

A simple, fully convolutional model for real-time instance segmentation.

Papers & Publications

Progressive Face Super-Resolution via Attention to Facial Landmark

Abstract: Face Super-Resolution (SR) is a subfield of the SR domain that specifically targets the reconstruction of face images. The main challenge of face SR is to restore essential facial features without distortion. We propose a novel face SR method that generates photo-realistic 8x super-resolved face images with fully retained facial details. To that end, we adopt a progressive training method, which allows stable training by splitting the network into successive steps, each producing output with a progressively higher resolution. We also propose a novel facial attention loss and apply it at each step to focus on restoring facial attributes in greater details by multiplying the pixel difference and heatmap values. Lastly, we propose a compressed version of the state-of-the-art face alignment network (FAN) for landmark heatmap extraction….
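The facial attention loss described in the abstract weights the per-pixel reconstruction error by landmark heatmap values, so errors near facial landmarks (eyes, nose, mouth) are penalized more. A minimal NumPy sketch of that idea, assuming an L1 pixel difference; the paper's exact norm and weighting details may differ:

```python
import numpy as np

def facial_attention_loss(sr, hr, heatmap):
    # sr: super-resolved image, hr: ground-truth high-res image,
    # heatmap: landmark heatmap in [0, 1] (same shape as the images).
    # Multiply the per-pixel difference by the heatmap value, then average.
    return float(np.mean(np.abs(sr - hr) * heatmap))
```

With a uniform heatmap this reduces to a plain L1 loss; a heatmap concentrated on the landmarks focuses training on facial details.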

A critique of pure learning and what artificial neural networks can learn from animal brains

Abstract: Artificial neural networks (ANNs) have undergone a revolution, catalyzed by better supervised learning algorithms. However, in stark contrast to young animals (including humans), training such networks requires enormous numbers of labeled examples, leading to the belief that animals must rely instead mainly on unsupervised learning. Here we argue that most animal behavior is not the result of clever learning algorithms—supervised or unsupervised—but is encoded in the genome. Specifically, animals are born with highly structured brain connectivity, which enables them to learn very rapidly. Because the wiring diagram is far too complex to be specified explicitly in the genome, it must be compressed through a “genomic bottleneck”. The genomic bottleneck suggests a path toward ANNs capable of rapid learning.