Deep Learning Weekly Issue #107

Music generators, brain-to-speech synthesis, sparse generative transformation, and more...

Hey folks,

This week in deep learning we bring you a neural network that generates music, a model to predict antibiotic resistance in tuberculosis, and speech synthesis from brain waves.

From there we recommend a recipe for training neural networks, notes on AI bias, a new generative transformer model, and an extremely clever use of people doing the “mannequin challenge” on YouTube.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


Harvard AI determines when tuberculosis becomes resistant to common drugs [VentureBeat]

Tuberculosis is notoriously difficult to target due to its ability to develop resistance to antibiotics. A new set of models by researchers at Harvard Medical School can predict resistance much earlier in the drug discovery process.

MuseNet - A deep neural network for generating long compositions [OpenAI]

OpenAI adapted its popular GPT-2 transformer architecture to generate some of the best musical compositions we’ve heard from neural networks.

Brain signals translated into speech using artificial intelligence [New York Times]

Neuroscientists used deep neural networks to synthesize speech from brain signals.

Facebook's AI missed Christchurch shooting videos filmed in first-person [Engadget]

Facebook’s AI-powered content filters had never seen first-person shooter footage during training and missed the horrific Christchurch videos.


A Recipe for Training Neural Networks

A fantastic set of heuristics, tips, and tricks to help anyone train a neural network.

Evaluating the Unsupervised Learning of Disentangled Representations

A large scale evaluation of unsupervised scene disentanglement along with key insights into improving future models.

How To Build a Deep Learning Model to Predict Employee Retention Using Keras and TensorFlow

Building a deep learning model that will predict the probability of an employee leaving a company.
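Attrition prediction like this is typically framed as binary classification: a sigmoid output layer yields a probability of leaving per employee. A minimal Keras sketch of that setup, using randomly generated stand-in features and labels (the feature names, shapes, and data here are hypothetical, not from the article):

```python
# Minimal sketch of a binary attrition classifier in Keras.
# X and y are hypothetical stand-ins for real HR data.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)).astype("float32")          # 8 hypothetical features per employee
y = rng.integers(0, 2, size=(200, 1)).astype("float32")  # 1 = left the company

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),      # outputs P(employee leaves)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, verbose=0)

probs = model.predict(X, verbose=0)                      # per-employee attrition probabilities
```

The sigmoid ensures each prediction lands in [0, 1], so it can be read directly as a probability and thresholded (e.g. at 0.5) for a leave/stay decision.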

Benedict Evans: Notes on AI-bias

Benedict Evans provides some thoughts on bias in machine learning models.

Libraries & Code

Representer Point Selection for Explaining Deep Neural Networks

Model explainability by examining the training examples most relevant to a particular output.

Microsoft’s VoTT (Visual Object Tagging Tool)

A slick looking open source tool for annotating images and videos.

PyTorch implementation of PlaNet: A Deep Planning Network for Reinforcement Learning

A PyTorch implementation of Google’s recent reinforcement learning technique.

Generative Modeling with Sparse Transformers

A new Sparse Transformer neural network architecture that uses a novel attention mechanism to extract patterns from sequences 30x longer than previously possible.


Speedgate: World’s First Sport Generated by AI [AKQA]

In a fun project, developers trained RNNs on the written rules of 400 sports, then had the networks generate rule sets of their own. After playing a few model-generated sports, they found one that stuck.

Papers & Publications

Learning the Depths of Moving People by Watching Frozen People

Abstract: We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving....In this paper, we take a data-driven approach and learn human depth priors from a new source of data: thousands of Internet videos of people imitating mannequins, i.e., freezing in diverse, natural poses, while a hand-held camera tours the scene. Since the people are stationary, training data can be created from these videos using multi-view stereo reconstruction. At inference time, our method uses motion parallax cues from the static areas of the scenes, and shows clear improvement over state-of-the-art monocular depth prediction methods…

Fashion++: Minimal Edits for Outfit Improvement

Abstract: Minimal outfit edits suggest minor changes to an existing outfit in order to improve its fashionability. For example, changes might entail (left) removing an accessory; (middle) changing to a blouse with higher neckline; (right) tucking in a shirt. Our model consists of a deep image generation neural network that learns to synthesize clothing conditioned on learned per-garment encodings. The latent encodings are explicitly factorized according to shape and texture, thereby allowing direct edits for both fit/presentation and color/patterns/material, respectively. We show how to bootstrap Web photos to automatically train a fashionability model, and develop an activation maximization-style approach to transform the input image into its more fashionable self…