Deep Learning Weekly Issue #136
Acquisitions from Snap and Intel, data augmentation from Google, HuggingFace tokenizers, TF 2.1, and more...
You may also enjoy a summary of trends from NeurIPS 2019, transformers learning to play chess, AI applications to economic research from Amazon, a new augmentation technique from Google, a tokenizer library from HuggingFace, and more.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly
Until next week!
Lyft open sources a model serving and workflow tool.
AI-based creativity tools continue to be a massive value add for social networks increasingly relying on computer vision.
The deepfake wars continue to heat up.
At CES, Samsung debuted a computer vision-based keyboard that allows users to type on any flat surface.
Arduino announced a new $99 board designed for AI workloads.
With so many AI chip makers, consolidation is likely.
A new dataset, representation, and model for decomposing actions in a video into structured graph data.
64,000 pictures of cars, labeled by make, model, year, price, horsepower, body style, etc.
TensorFlow 2.1 consolidates CPU and GPU flavors, brings TPU support to Keras, and will be the last major release supporting Python 2.
Transformers learn how to play chess better than amateurs.
Chip Huyen summarizes some of the biggest trends at NeurIPS 2019.
Jeff Dean talks transformers, specialized hardware, and robots.
Amazon applies deep learning to create additional data for economic models.
Libraries & Code
AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty
The model detects faces and facial features in real time.
HuggingFace introduces fast tokenizers to go with their language models.
Papers & Publications
Abstract: …. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM's hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
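The steering step described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "language model" here is assumed to be a single linear layer `W` mapping a hidden vector to logits over a four-word vocabulary, the attribute model is a bag of words, and the names `steer_hidden`, `bag_probability`, `step_size`, and `num_steps` are hypothetical. It only shows the core idea of using the attribute model's gradient to nudge the hidden activations.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from_hidden(W, h):
    # Toy stand-in for the LM head: logits = W @ h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def bag_probability(W, h, bag):
    # Attribute score: total probability assigned to the bag-of-words tokens.
    p = softmax(logits_from_hidden(W, h))
    return sum(p[i] for i in bag)

def steer_hidden(W, h, bag, step_size=0.5, num_steps=10):
    # Gradient ascent on log p(bag | h) with respect to the hidden state h.
    h = list(h)
    for _ in range(num_steps):
        p = softmax(logits_from_hidden(W, h))  # forward pass
        p_bag = sum(p[i] for i in bag)
        # d log p_bag / d logit_j = [j in bag] * p_j / p_bag - p_j
        grad_logits = [(p[j] / p_bag if j in bag else 0.0) - p[j]
                       for j in range(len(p))]
        # Backward pass through the linear layer: grad_h = W^T @ grad_logits
        grad_h = [sum(W[j][k] * grad_logits[j] for j in range(len(W)))
                  for k in range(len(h))]
        h = [x + step_size * g for x, g in zip(h, grad_h)]
    return h

# Assumed toy setup: vocabulary of 4 tokens, hidden size 3, bag = {token 2}.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, -0.5]]
h0 = [0.5, 0.2, -0.3]
bag = {2}

before = bag_probability(W, h0, bag)
steered = steer_hidden(W, h0, bag)
after = bag_probability(W, steered, bag)
print(f"bag probability before: {before:.3f}, after: {after:.3f}")
```

In this sketch the weights `W` are never updated; only the hidden state moves, which mirrors the abstract's point that the LM needs no further training.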
Abstract: We propose a method to realistically insert synthetic objects into existing photographs without requiring access to the scene or any additional scene measurements. With a single image and a small amount of annotation, our method creates a physical model of the scene that is suitable for realistically rendering synthetic objects with diffuse, specular, and even glowing materials while accounting for lighting interactions between the objects and the scene. We demonstrate in a user study that synthetic images produced by our method are confusable with real scenes, even for people who believe they are good at telling the difference. Further, our study shows that our method is competitive with other insertion methods while requiring less scene information. We also collected new illumination and reflectance datasets; renderings produced by our system compare well to ground truth. Our system has applications in the movie and gaming industry, as well as home decorating and user content creation, among others.