Deep Learning Weekly Issue #118

Microsoft + OpenAI, FaceApp goes viral, AI-mapping tools from Facebook, PyTorch transformers and more!

Hey folks,

This week in deep learning we bring you Microsoft’s investment in OpenAI, two new tools from Facebook (AI-generated OpenStreetMap data and an RL environment for Minecraft), and some words of warning about the latest viral face-bending app.

You may also enjoy a review of image registration techniques, generating new watch designs with StyleGAN, some controversy over BERT results, a new data augmentation technique for image recognition, a library for training binarized neural networks, and more!

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


Microsoft invests $1 billion in OpenAI

Behind the hyped language, OpenAI has agreed to transition entirely to Azure and help Microsoft develop new capabilities in exchange for a lot of compute.

Facebook speeds up mapping data validation with machine learning tools Map With AI and RapiD

Facebook has open-sourced a tool to automate digital map creation from satellite imagery. The new tool integrates directly with OpenStreetMap.

Open-sourcing CraftAssist, a platform for studying collaborative AI bots in Minecraft

Facebook has also announced another open-source reinforcement learning tool. The digital sandbox for Minecraft makes it easier to record data and train interactive agents.

Why we released Grover

Researchers behind the Grover transformer talk about why they released their model in contrast to OpenAI’s decision to keep the large GPT-2 model proprietary.

FaceApp Makes Today’s Privacy Laws Look Antiquated

A gender-, age-, and expression-bending app has gone viral, igniting backlash over data and privacy issues.


Image Registration: From SIFT to Deep Learning

A nice review of image registration models and recent progress with deep learning.

Probing Neural Network Comprehension of Natural Language Arguments

BERT's performance on the Argument Reasoning Comprehension Task is likely due entirely to spurious correlations in the training data.

Generating New Watch Designs With StyleGAN

Great results training a StyleGAN model to generate watch designs with modest compute resources.

Neural network in glass requires no power, recognizes numbers

Researchers simulate specialized pieces of glass that exploit light’s wavefront to perform image recognition tasks.

Progressive Sprinkles (cutout variation) - Image segmentation data augmentation

A student develops a new data augmentation technique to achieve SOTA performance on a few recognition tasks.
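
For readers unfamiliar with cutout-style augmentation, here is a minimal sketch of the general idea: mask out several small random squares of an image so the model cannot rely on any single region. This is an illustration only, not the linked author's implementation. The function name `sprinkle_cutout` and its parameters are hypothetical, the image is a plain list of rows to keep the example dependency-free, and the "progressive" scheduling mentioned in the title is not modeled.

```python
import random


def sprinkle_cutout(image, num_holes=8, hole_size=2, fill=0, rng=None):
    """Return a copy of `image` (a list of rows) with small random squares masked.

    Hypothetical helper for illustration; holes may overlap and are clipped
    at the bottom and right edges of the image.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is left untouched
    for _ in range(num_holes):
        top = rng.randrange(height)
        left = rng.randrange(width)
        for y in range(top, min(top + hole_size, height)):
            for x in range(left, min(left + hole_size, width)):
                out[y][x] = fill
    return out


# Toy 16x16 "image" where every pixel is 1.
image = [[1] * 16 for _ in range(16)]
augmented = sprinkle_cutout(image, num_holes=8, hole_size=2, rng=random.Random(0))
masked = sum(row.count(0) for row in augmented)
```

In practice the same masking would be applied to image arrays inside a training data pipeline, with a fresh random mask for each sample.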

This AI magically removes moving objects from videos

A combination of masking and in-painting removes moving objects from videos. GitHub link in the article.

Libraries & Code

[GitHub] huggingface/pytorch-transformers

A comprehensive, well-organized set of transformer models implemented in PyTorch.

[GitHub] nuno-faria/tetris-ai

Using Q-Learning with Keras to train a model to play Tetris.
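
As a refresher on the underlying idea, the sketch below shows the classic tabular Q-learning update. It is a simplification and not the repository's code: the linked project uses a Keras model rather than a table, and the state names, action names, and hyperparameters here are made up for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table `q` and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical states and actions; unseen entries default to 0.0.
q = defaultdict(float)
actions = ["left", "right", "rotate", "drop"]
new_value = q_update(q, "s0", "drop", reward=1.0, next_state="s1", actions=actions)
```

A deep Q-learning setup replaces the table lookup with a neural network that predicts the value of a state or state-action pair, but the target it is trained toward has the same form.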

[GitHub] larq/larq

An Open-Source Library for Training Binarized Neural Networks
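
To give a feel for what "binarized" means, here is a minimal sketch, assuming the common convention of mapping values to +1 or -1 by sign. It does not use or reproduce the larq API; the helper names are hypothetical, and the training side (which typically keeps real-valued weights and approximates gradients through the sign function) is not shown.

```python
def binarize(values):
    """Map each real value to +1 or -1 by its sign (zero maps to +1 here)."""
    return [1 if v >= 0 else -1 for v in values]


def binary_dot(weights, inputs):
    """Dot product computed after binarizing both weights and inputs."""
    bw, bx = binarize(weights), binarize(inputs)
    return sum(w * x for w, x in zip(bw, bx))


# Made-up example values.
weights = [0.7, -0.2, 0.05, -1.3]
inputs = [0.4, -0.9, -0.1, -0.6]
result = binary_dot(weights, inputs)
```

Because every operand is +1 or -1, the multiplications in such a layer can be implemented with cheap bitwise operations on suitable hardware, which is the main appeal of the approach.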

Papers & Publications

Coordinate-based Texture Inpainting for Pose-Guided Image Generation

Abstract: We present a new deep learning approach to pose-guided resynthesis of human photographs. At the heart of the new approach is the estimation of the complete body surface texture based on a single photograph. Since the input photograph always observes only a part of the surface, we suggest a new inpainting method that completes the texture of the human body. Rather than working directly with colors of texture elements, the inpainting network estimates an appropriate source location in the input image for each element of the body surface. The final convolutional network then uses the established correspondence and all other available information to synthesize the output image….

On the “steerability” of generative adversarial networks

Abstract: … We show that although current GANs can fit standard datasets very well, they still fall short of being comprehensive models of the visual manifold. In particular, we study their ability to fit simple transformations such as camera movements and color changes. We find that the models reflect the biases of the datasets on which they are trained (e.g., centered objects), but that they also exhibit some capacity for generalization: by "steering" in latent space, we can shift the distribution while still creating realistic images. We hypothesize that the degree of distributional shift is related to the breadth of the training data distribution, and conduct experiments that demonstrate this.