Deep Learning Weekly - 🤖 - Issue #114: Detecting Photoshops, TensorFlow Text, Facebook's photo-realistic simulator, PizzaGAN and more...

Bringing you everything new and exciting in the world of deep learning, from academia to the grubby depths of industry, every week right to your inbox. Free.

Issue 114

Hey folks,

This week in deep learning we bring you a Photoshop detector from Adobe, AR beauty try-on on YouTube, text tools in TensorFlow, and a car pose API.

You may also be interested in a new speech synthesis model from Google, weight agnostic neural networks, photo-realistic simulation environments from Facebook, and a GAN-based pizza maker.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!

Industry

Adobe’s prototype AI tool automatically spots Photoshopped faces

Adobe is now building tools that detect whether its own software was used to manipulate an image.
 

YouTube announces AR Beauty Try-On

YouTube viewers can use augmented reality and face tracking to try on virtual makeup and follow along with other creators.

 

Introducing tf.text

New tools for manipulating text and building NLP models with TensorFlow.

 

Applying AutoML to Transformer Architectures

Google applies evolution-based neural architecture search to find better transformer architectures for solving sequence-to-sequence problems.

 

Zensors releases Car Pose Net API

A 2D pose estimation model for tracking cars.

Learning

Audio samples from "Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis"

Results from Capacitron, Google’s latest iteration on the Tacotron speech synthesis model, are very impressive.

 

Weight Agnostic Neural Networks

A search method that looks for network architectures able to solve a task without any weight training, instead of optimizing the weights of a fixed architecture.
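
To make the idea concrete, here is a toy sketch, not the paper's implementation. Every connection in a candidate architecture shares a single weight value, and the architecture is scored by its average error across several such values. The weight values, the `forward` and `score` helpers, the two candidate architectures, and the toy task are all assumptions made for illustration.

```python
# Hypothetical shared-weight values to sweep; the exact set is an assumption.
SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]


def forward(arch, x, w):
    """Run a tiny feed-forward net in which every connection uses the weight w.

    arch is a list of (activation, input_indices) tuples in topological order.
    Index 0 refers to the network input, index i + 1 to the output of node i.
    """
    values = [x]
    for activation, inputs in arch:
        total = sum(w * values[i] for i in inputs)
        values.append(activation(total))
    return values[-1]


def score(arch, data, weights=SHARED_WEIGHTS):
    """Negative mean squared error, averaged over all shared-weight values."""
    errors = []
    for w in weights:
        for x, y in data:
            errors.append((forward(arch, x, w) - y) ** 2)
    return -sum(errors) / len(errors)


def identity(t):
    return t


def nonzero(t):
    # Fires whenever the incoming signal is nonzero, whatever its sign or scale.
    return 1.0 if abs(t) > 0 else 0.0


# Toy task: output 1 when the input is nonzero, 0 otherwise.
data = [(-1.0, 1.0), (0.0, 0.0), (1.0, 1.0)]

# Two single-node candidates that differ only in their activation function.
linear_arch = [(identity, [0])]
detector_arch = [(nonzero, [0])]

if __name__ == "__main__":
    for name, arch in [("linear", linear_arch), ("detector", detector_arch)]:
        print(name, score(arch, data))
```

The detector architecture solves the toy task for every shared weight in the sweep, while the linear one depends on the weight being tuned. A search over architectures would keep the former.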

 

How to make a pizza: Learning a compositional layer-based GAN model

Generating images by compositing layers created by GANs.

 

Techniques to Tackle Overfitting and Achieve Robustness for Donkey Car Neural Network Self-Driving Agent

Informative deep dive into very practical techniques to achieve better performance for self-driving RC cars.

Fun

[Video] Cats, Rats, A.I., Oh My! - Ben Hamm

Training a model to detect a cat trying to bring prey into the house.

Libraries & Code

[GitHub] keras/keras-tuner

A very early alpha for the Keras hyperparameter tuning project is up.

 

Facebook: Open-sourcing AI Habitat, an advanced simulation platform for embodied AI research

Facebook open sources tools to create photo-realistic environments that can be used to train agents via reinforcement learning.

 

[GitHub] bshall/UniversalVocoding

A PyTorch implementation of "Robust Universal Neural Vocoding".

Papers & Publications

Text2Scene: Generating Compositional Scenes from Textual Descriptions

Abstract: In this paper, we propose Text2Scene, a model that generates various forms of compositional scene representations from natural language descriptions. Unlike recent works, our method does NOT use Generative Adversarial Networks (GANs). Text2Scene instead learns to sequentially generate objects and their attributes (location, size, appearance, etc) at every time step by attending to different parts of the input text and the current status of the generated scene. We show that under minor modifications, the proposed framework can handle the generation of different forms of scene representations, including cartoon-like scenes, object layouts corresponding to real images, and synthetic images….

 

Stacked Capsule Autoencoders

Abstract: ….We describe an unsupervised version of capsule networks, in which a neural encoder, which looks at all of the parts, is used to infer the presence and poses of object capsules. The encoder is trained by backpropagating through a decoder, which predicts the pose of each already discovered part using a mixture of pose predictions. The parts are discovered directly from an image, in a similar manner, by using a neural encoder, which infers parts and their affine transformations. The corresponding decoder models each image pixel as a mixture of predictions made by affine-transformed parts. We learn object- and their part-capsules on unlabeled data, and then cluster the vectors of presences of object capsules….

For more deep learning news, tutorials, code, and discussion, join us on Slack, Twitter, and GitHub.