Deep Learning Weekly Issue #114
Detecting Photoshops, TensorFlow Text, Facebook's photo-realistic simulator, PizzaGAN and more...
This week in deep learning we bring you a Photoshop detector from Adobe, beauty try-on in YouTube, text tools in TensorFlow, and a car pose API.
You may also be interested in a new speech synthesis model from Google, weight agnostic neural networks, photo-realistic simulation environments from Facebook, and a GAN-based pizza maker.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Adobe’s prototype AI tool automatically spots Photoshopped faces
Adobe is now building tools to detect whether its own software was used to manipulate an image.
YouTube announces AR Beauty Try-On
YouTube viewers can use augmented reality and face tracking to try on virtual makeup and follow along with other creators.
New tools for manipulating text and building NLP models with TensorFlow.
Applying AutoML to Transformer Architectures
Google applies evolution-based neural architecture search to find better Transformer architectures for solving sequence-to-sequence problems.
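For readers new to evolution-based architecture search, here is a minimal sketch of the general idea. It is a generic tournament-selection loop, not Google's actual method. The layer names in CHOICES and the proxy_fitness scoring function are hypothetical stand-ins; a real search would train each candidate model and measure its validation quality.

```python
import random

# Hypothetical layer types a candidate architecture can be built from.
CHOICES = ["attention", "conv", "feedforward"]


def proxy_fitness(arch):
    """Stand-in for 'train the model and measure quality' (an assumption).

    This toy score simply counts adjacent layers of different types.
    """
    return sum(1 for a, b in zip(arch, arch[1:]) if a != b)


def mutate(arch, rng):
    """Return a copy of arch with one randomly chosen layer re-sampled."""
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] = rng.choice(CHOICES)
    return child


def evolve(pop_size=20, depth=6, steps=200, tournament=5, seed=0):
    """Evolve a population of architectures and return the best one seen."""
    rng = random.Random(seed)
    population = [
        [rng.choice(CHOICES) for _ in range(depth)] for _ in range(pop_size)
    ]
    best = max(population, key=proxy_fitness)
    for _ in range(steps):
        # Tournament: sample a few candidates and keep the fittest as parent.
        parent = max(rng.sample(population, tournament), key=proxy_fitness)
        child = mutate(parent, rng)
        # The child joins the population and the oldest member is dropped.
        population.append(child)
        population.pop(0)
        if proxy_fitness(child) > proxy_fitness(best):
            best = child
    return best


best_arch = evolve()
print(best_arch, proxy_fitness(best_arch))
```

The expensive part in practice is the fitness evaluation, which is why the toy scoring function here is only a placeholder.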
Zensors releases Car Pose Net API
A 2D pose estimation model for tracking cars.
Audio samples from "Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis"
Results from Capacitron, Google’s latest iteration on the Tacotron speech synthesis model, are very impressive.
Weight Agnostic Neural Networks
A new technique that searches for network architectures that can solve tasks without training their weights.
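A rough sketch of what scoring an architecture independently of its weights could look like: every connection shares one weight value, and a candidate is judged by its average performance across several such values. The weight set, the toy task (predict |x|), and the two candidate networks are all assumptions made for illustration and are not taken from the paper.

```python
# Shared weight values each architecture is evaluated with (assumed set).
SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]


def forward(connections, activations, inputs, n_nodes, w):
    """Run a feed-forward net in which every connection uses the weight w.

    connections: list of (src, dst) node-index pairs with src < dst.
    activations: dict mapping each non-input node to its activation function.
    """
    values = [0.0] * n_nodes
    values[:len(inputs)] = inputs
    for node in range(len(inputs), n_nodes):
        total = sum(w * values[src] for src, dst in connections if dst == node)
        values[node] = activations[node](total)
    return values[-1]  # the last node is treated as the output


def score(connections, activations, n_nodes, dataset):
    """Negative mean squared error, averaged over all shared weight values."""
    total = 0.0
    for w in SHARED_WEIGHTS:
        err = sum(
            (forward(connections, activations, x, n_nodes, w) - y) ** 2
            for x, y in dataset
        )
        total -= err / len(dataset)
    return total / len(SHARED_WEIGHTS)


# Toy task (hypothetical): predict |x| from a single input.
DATASET = [([-1.0], 1.0), ([0.0], 0.0), ([1.0], 1.0)]

# Two hypothetical candidates with the same wiring but different activations.
CANDIDATES = {
    "identity_net": ([(0, 1)], {1: lambda v: v}, 2),
    "abs_net": ([(0, 1)], {1: abs}, 2),
}

best_name = max(CANDIDATES, key=lambda name: score(*CANDIDATES[name], DATASET))
print(best_name)
```

The candidate with the absolute-value activation wins because its structure fits the task for every shared weight tried, which is the kind of inductive bias this style of search rewards.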
How to make a pizza: Learning a compositional layer-based GAN model
Generating images by compositing layers created by GANs.
Techniques to Tackle Overfitting and Achieve Robustness for Donkey Car Neural Network Self-Driving Agent
Informative deep dive into very practical techniques to achieve better performance for self-driving RC cars.
[Video] Cats, Rats, A.I., Oh My! - Ben Hamm
Training a model to detect a cat trying to bring prey into the house.
Libraries & Code
A very early alpha for the Keras hyperparameter tuning project is up.
Facebook: Open-sourcing AI Habitat, an advanced simulation platform for embodied AI research
Facebook open sources tools to create photo-realistic environments that can be used to train agents via reinforcement learning.
A PyTorch implementation of "Robust Universal Neural Vocoding"
Papers & Publications
Text2Scene: Generating Compositional Scenes from Textual Descriptions
Abstract: In this paper, we propose Text2Scene, a model that generates various forms of compositional scene representations from natural language descriptions. Unlike recent works, our method does NOT use Generative Adversarial Networks (GANs). Text2Scene instead learns to sequentially generate objects and their attributes (location, size, appearance, etc) at every time step by attending to different parts of the input text and the current status of the generated scene. We show that under minor modifications, the proposed framework can handle the generation of different forms of scene representations, including cartoon-like scenes, object layouts corresponding to real images, and synthetic images….
Abstract: ….We describe an unsupervised version of capsule networks, in which a neural encoder, which looks at all of the parts, is used to infer the presence and poses of object capsules. The encoder is trained by backpropagating through a decoder, which predicts the pose of each already discovered part using a mixture of pose predictions. The parts are discovered directly from an image, in a similar manner, by using a neural encoder, which infers parts and their affine transformations. The corresponding decoder models each image pixel as a mixture of predictions made by affine-transformed parts. We learn object- and their part-capsules on unlabeled data, and then cluster the vectors of presences of object capsules…