Deep Learning Weekly Issue #123
An AI chip from Huawei, a deep dive into Oculus, more efficient transformers, new tools from DeepMind and more
This week in deep learning we bring you a new AI chip from Huawei, a look at the AI models inside the Oculus Quest, the 774M parameter GPT-2 model, and a new project from Google that uses AR to (finally) put Harriet Tubman on the $20 bill.
You may also enjoy a new loss function for training models on noisy data, a Keras API for training models on encrypted data, a method for training more efficient transformers, and a look at what neural networks can learn from animal brains.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Google has launched a new app that uses AR technology to superimpose portraits of famous women onto US currency.
Another contender enters the AI chip race.
Facebook details some of the ways it uses AI-based algorithms in the Oculus Quest.
New research from Amazon on improving the quality of voice synthesis to make it more human-like.
OpenAI releases the 774M-parameter GPT-2 model, but still hasn’t released the largest one.
Researchers from Google propose a new loss function for robust training, even with noisy training data.
A high-level Keras API for training models on encrypted data.
Facebook’s AI team introduces adaptive span attention and all-layer attention to make transformers more computationally and memory efficient.
Another technique (distillation) for shrinking transformers from the HuggingFace team.
A nice overview of various speech synthesis techniques.
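For readers new to the distillation idea mentioned above, here is a minimal sketch of the classic soft-target distillation loss, in which a small student model is trained to match a larger teacher's softened output distribution. This is the generic textbook formulation, not necessarily the exact objective used by the HuggingFace team; the function names and the temperature value are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the result is scaled by temperature**2, a common convention
    that keeps gradient magnitudes comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return kl * temperature ** 2


# Hypothetical logits for a 3-class problem.
teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]
diverging_student = [0.1, 1.0, 2.0]

loss_match = distillation_loss(matching_student, teacher)
loss_diverge = distillation_loss(diverging_student, teacher)
```

A student that reproduces the teacher's logits gets zero loss, and the loss grows as the two distributions diverge. In practice this term is usually combined with the ordinary task loss on ground-truth labels.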
Libraries & Code
Clone a voice in 5 seconds to generate arbitrary speech in real time.
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
A simple, fully convolutional model for real-time instance segmentation.
Papers & Publications
Abstract: Face Super-Resolution (SR) is a subfield of the SR domain that specifically targets the reconstruction of face images. The main challenge of face SR is to restore essential facial features without distortion. We propose a novel face SR method that generates photo-realistic 8x super-resolved face images with fully retained facial details. To that end, we adopt a progressive training method, which allows stable training by splitting the network into successive steps, each producing output with a progressively higher resolution. We also propose a novel facial attention loss and apply it at each step to focus on restoring facial attributes in greater detail by multiplying the pixel difference and heatmap values. Lastly, we propose a compressed version of the state-of-the-art face alignment network (FAN) for landmark heatmap extraction….
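The facial attention loss described in the abstract above multiplies the pixel difference by landmark heatmap values, so errors near facial landmarks count more. A rough sketch of that idea follows; the use of an absolute (L1) difference, a single-channel heatmap in [0, 1], and averaging over all pixels are assumptions for illustration, and the paper's exact formulation may differ.

```python
def facial_attention_loss(pred, target, heatmap):
    """Mean of heatmap-weighted absolute pixel differences.

    pred, target, heatmap: 2D lists of floats with identical shapes.
    Assumptions: L1 pixel difference, heatmap values in [0, 1].
    """
    total = 0.0
    count = 0
    for p_row, t_row, h_row in zip(pred, target, heatmap):
        for p, t, h in zip(p_row, t_row, h_row):
            # Errors only contribute where the heatmap is non-zero.
            total += h * abs(p - t)
            count += 1
    return total / count


# Toy 2x2 "images": the heatmap highlights only the top-right pixel.
pred = [[0.0, 0.5], [1.0, 1.0]]
target = [[0.0, 1.0], [0.0, 1.0]]
heatmap = [[0.0, 1.0], [0.0, 0.0]]

loss = facial_attention_loss(pred, target, heatmap)
```

In this toy example the large error at the bottom-left pixel is ignored because its heatmap value is zero, while the smaller error at the highlighted pixel drives the loss.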
Abstract: Artificial neural networks (ANNs) have undergone a revolution, catalyzed by better supervised learning algorithms. However, in stark contrast to young animals (including humans), training such networks requires enormous numbers of labeled examples, leading to the belief that animals must rely instead mainly on unsupervised learning. Here we argue that most animal behavior is not the result of clever learning algorithms—supervised or unsupervised—but is encoded in the genome. Specifically, animals are born with highly structured brain connectivity, which enables them to learn very rapidly. Because the wiring diagram is far too complex to be specified explicitly in the genome, it must be compressed through a “genomic bottleneck”. The genomic bottleneck suggests a path toward ANNs capable of rapid learning.