Deep Learning Weekly Issue #107
Music generators, brain-to-speech synthesis, sparse generative transformation, and more...
This week in deep learning we bring you a neural network that generates music, a model to predict antibiotic resistance in tuberculosis, and speech synthesis from brain waves.
From there, we recommend a recipe for training neural networks, notes on AI bias, a new generative transformer model, and an extremely clever use of people doing the “mannequin challenge” on YouTube.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Tuberculosis is notoriously difficult to target due to its ability to develop resistance to antibiotics. A new set of models by researchers at Harvard Medical School can predict resistance much earlier in the drug discovery process.
OpenAI adapted its popular GPT-2 transformer architecture to generate some of the best musical compositions we’ve heard from neural networks.
Neuroscientists use deep neural networks to synthesize speech from brain signals.
Facebook’s AI-powered content filters had never seen first-person shooter footage during training and missed the horrific Christchurch videos.
A fantastic set of heuristics, tips, and tricks to help anyone train a neural network.
A large-scale evaluation of unsupervised scene disentanglement, along with key insights into improving future models.
Building a deep learning model that will predict the probability of an employee leaving a company.
Benedict Evans provides some thoughts on bias in machine learning models.
Libraries & Code
Model explainability by examining the training examples most relevant to a particular output.
A slick-looking open source tool for annotating images and videos.
PyTorch implementation of PlaNet: A Deep Planning Network for Reinforcement Learning
A PyTorch implementation of Google’s recent reinforcement learning technique.
A new Sparse Transformer neural network architecture that uses a novel attention mechanism to extract patterns from sequences 30x longer than possible previously.
In a fun project, developers trained RNNs on the written rules of 400 sports and then started generating rule sets of their own. After playing a few model-generated sports, they found one that stuck.
Papers & Publications
Abstract: We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving… In this paper, we take a data-driven approach and learn human depth priors from a new source of data: thousands of Internet videos of people imitating mannequins, i.e., freezing in diverse, natural poses, while a hand-held camera tours the scene. Since the people are stationary, training data can be created from these videos using multi-view stereo reconstruction. At inference time, our method uses motion parallax cues from the static areas of the scenes, and shows clear improvement over state-of-the-art monocular depth prediction methods…
Abstract: Minimal outfit edits suggest minor changes to an existing outfit in order to improve its fashionability. For example, changes might entail (left) removing an accessory; (middle) changing to a blouse with higher neckline; (right) tucking in a shirt. Our model consists of a deep image generation neural network that learns to synthesize clothing conditioned on learned per-garment encodings. The latent encodings are explicitly factorized according to shape and texture, thereby allowing direct edits for both fit/presentation and color/patterns/material, respectively. We show how to bootstrap Web photos to automatically train a fashionability model, and develop an activation maximization-style approach to transform the input image into its more fashionable self…