| March 8 · Issue #31 |
Hi folks, this week’s issue is chock-full of awesome resources: there is a massive audio data set, a Google AI that can pinpoint a location from an image more accurately than humans, an affordable NVIDIA card, a deep learning setup with Kubernetes, and fascinating research papers, including one putting an end to the violent ethos of GANs.
As always, if you enjoy receiving this newsletter, you can help us out by sharing it with your friends and colleagues.
| NVIDIA Unveils GeForce GTX 1080 Ti: Available for $699 |
NVIDIA’s new card offers 11GB of VRAM instead of the 12GB found in the Titan X; however, it does so at half the price!
| Deep Voice: Real-Time Neural Text-to-Speech for Production |
Baidu Research presents Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks. They were able to achieve audio synthesis in real time, which amounts to an up to 400X speedup over previous WaveNet inference implementations. You can find the corresponding paper here.
| Google Unveils Neural Network with “Superhuman” Ability to Determine the Location of Almost Any Image |
This is one of those obvious (and fun) applications of deep learning that you kick yourself for not thinking of first. Google’s AI PlaNet is able to localize 3.6 percent of the images in the street-view corpus at street-level accuracy and 10.1 percent at city-level accuracy. Moreover, it determines the country of origin for 28.4 percent of the photos and the continent for 48.0 percent of them.
While this might not sound super impressive at first blush, PlaNet consistently beat human adversaries, and if you’re still not convinced, try to beat these stats on www.geoguessr.com. See here for the research paper.
| Three Things You Need to Know About Machine Learning |
A great survey of the machine learning landscape and the opportunities and risks associated with recent advances. Towards the end, the author, Medha Agarwal, an investor at Redpoint Ventures, gives an outlook on how deep learning will change the user experience of software products. This is an aspect that we feel is still discussed too little, but that cuts across every product and vertical. If you are a UX/UI/product person, you should get on that ;)
| State of Hyperparameter Selection |
If you are a practitioner who uses default hyperparameters, grid search, or random search to select your model’s hyperparameters, this post is for you.
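As a refresher on the baseline the post moves beyond, random search can be sketched in a few lines of plain Python. The search space, the parameter names (lr, depth), and the scoring function below are invented purely for illustration; in practice you would train and validate a real model inside the loop:

```python
import random

def evaluate(params):
    """Stand-in for a validation score; here it peaks near lr=0.01, depth=6."""
    return -(abs(params["lr"] - 0.01) * 100 + abs(params["depth"] - 6))

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Sample each hyperparameter independently; learning rates are
        # conventionally drawn on a log scale.
        params = {
            "lr": 10 ** rng.uniform(-4, -1),
            "depth": rng.randint(2, 10),
        }
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best, _ = random_search(n_trials=50)
```

The appeal over grid search is that each trial explores a new value of every hyperparameter, so the budget is not wasted re-testing values of unimportant parameters.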
| Introducing Similarity Search at Flickr |
An in-depth article on how Flickr implemented similarity search, providing a great example of how to combine several machine learning techniques into a powerful production-grade system.
| GPUs & Kubernetes for Deep Learning |
An extensive tutorial on how to set up a Kubernetes-based infrastructure for deep learning. Part 2 & Part 3
| TensorFlow: How to Optimise Your Input Pipeline With Queues and Multi-threading |
If you are using the feed_dict mechanism to train your TensorFlow models, you might be starving your GPUs of data. This post shows you how to avoid this performance problem.
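The core idea behind queue-based input pipelines — a background thread keeps a bounded queue filled so the training loop never waits on I/O — can be illustrated framework-free with Python’s standard library. The load_batch and train_step functions below are stand-ins for data loading and GPU work, not TensorFlow APIs:

```python
import queue
import threading
import time

def load_batch(i):
    time.sleep(0.01)   # pretend I/O and preprocessing cost
    return [i] * 4     # a fake batch of data

def train_step(batch):
    return sum(batch)  # pretend GPU work

def run_pipeline(num_batches, capacity=8):
    q = queue.Queue(maxsize=capacity)  # bounded, like a TF queue's capacity

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))  # blocks when the queue is full
        q.put(None)               # sentinel: no more data

    threading.Thread(target=producer, daemon=True).start()

    results = []
    while True:
        batch = q.get()           # batches are ready; no waiting on I/O
        if batch is None:
            break
        results.append(train_step(batch))
    return results
```

While one batch is being consumed, the producer is already loading the next ones, which is exactly the overlap that feed_dict cannot provide.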
| fastText: Pretrained Vectors for 90 Languages |
Facebook published pre-trained word vectors for 90 languages, trained on Wikipedia using fastText. These 300-dimensional vectors were obtained using the skip-gram model described in Bojanowski et al. (2016) with default parameters.
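The released .vec files are plain text: a header line with the vocabulary size and dimensionality, then one word followed by its floats per line. Here is a hedged sketch of a loader, using a tiny 3-dimensional in-memory sample in place of a real 300-dimensional file:

```python
import io

def load_vec(fileobj, limit=None):
    """Parse the fastText .vec text format: a 'count dim' header line,
    then one word followed by `dim` floats per line."""
    header = fileobj.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in fileobj:
        parts = line.rstrip().split(" ")
        word, values = parts[0], [float(x) for x in parts[1:]]
        vectors[word] = values
        if limit and len(vectors) >= limit:
            break  # the full files are large; stop early if asked
    return vectors, dim

# Illustrative sample; a real file would declare 300 dimensions.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
vectors, dim = load_vec(sample)
```

The limit parameter matters in practice: the full vocabulary for a large language can run to millions of rows, and you often only need the most frequent words, which come first in the file.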
| Faiss: A Library for Efficient Similarity Search and Clustering of Dense Vectors |
Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.
| AudioSet: A Massive Dataset of Manually Annotated Audio Events |
Google’s AudioSet consists of an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds.
| Deep and Hierarchical Implicit Models |
This is a rich paper in which the authors develop two families of models: hierarchical implicit models and deep implicit models. They combine the idea of implicit densities with hierarchical Bayesian modeling and deep neural networks. These methods scale up implicit models to sizes previously not possible and open the door to new modeling designs.
| Deep Forest: Towards An Alternative to Deep Neural Networks |
An important paper outlining a decision tree ensemble approach with performance highly competitive with deep neural networks. In addition, the authors claim an array of advantages over deep neural networks, such as easier training and hyperparameter selection, the need for less data, and better interpretability.
| On the Origin of Deep Learning |
An overview of the history of deep learning models. Great to brush up on the fundamentals.
| Stopping GAN Violence: Generative Unadversarial Networks |
“Under this framework, we simultaneously train two models: a generator G that does its best to capture whichever data distribution it feels it can manage, and a motivator M that helps G to achieve its dream.”