Deep Learning Weekly Issue #119
AI-generated perfume, a new question answering challenge, Lyft's Level 5 dataset, model compression, and more!
This week in deep learning we bring you an AI that creates perfumes, TensorFlow Addons, a new long-form question answering challenge from Facebook, and a GAN that can generate Twitch.tv emotes.
You may also enjoy a new model compression technique from FAIR, a review of semantic segmentation models, Lyft’s Level 5 self-driving car dataset, a Keras implementation of BiGAN, and a method for improving shift invariance in CNNs.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Researchers at the University of Colorado and Duke train a convolutional neural network to predict emotions from selfies.
A machine learning model takes chemical properties and demographic sales data to generate new perfume formulas.
TensorFlow Addons will replace tf.contrib, making it easier for the open-source community to extend TensorFlow functionality beyond core APIs.
Facebook announces a new challenge for long-form question answering including code, data, and baseline models.
Downloading over 2 million emotes from Twitch and training a GAN to produce new ones.
A new compression technique from Facebook shrinks Mask R-CNN by 26x while maintaining reasonable accuracy.
A tool for the interactive exploration of Convolutional Neural Networks.
Using RGB-D images and neural networks to render new views of complex scenes.
A review of popular and state-of-the-art semantic segmentation models.
An open image dataset of waste in the wild.
Lyft releases a massive dataset from its self-driving car fleet including LiDAR and cameras.
Libraries & Code
A reference implementation of a monocular depth model by Niantic Labs.
A Keras implementation of BiGAN to find similar images.
A new set of efficient models produced by Google’s mobile neural architecture search.
Papers & Publications
Abstract: Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling… We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling and strided-convolution. We observe increased accuracy in ImageNet classification, across several commonly-used architectures, such as ResNet, DenseNet, and MobileNet, indicating effective regularization. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions.
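To make the idea in the abstract above concrete, here is a minimal 1D sketch of why plain strided subsampling is sensitive to a one-sample shift and how low-pass filtering first removes that sensitivity. It is a toy illustration, not the paper's implementation: the `[1, 2, 1] / 4` blur kernel, the circular padding, and the helper names `roll`, `subsample`, and `blur_then_subsample` are all assumptions chosen for this example.

```python
def roll(xs, k):
    """Circularly shift a list right by k positions (hypothetical helper)."""
    k %= len(xs)
    return xs[-k:] + xs[:-k] if k else list(xs)


def subsample(xs, stride=2):
    """Plain strided subsampling with no filtering."""
    return xs[::stride]


def blur_then_subsample(xs, stride=2):
    """Low-pass filter with an assumed [1, 2, 1] / 4 kernel, then subsample.

    Circular padding is used so that shifting the input is exactly a rotation.
    """
    n = len(xs)
    blurred = [
        0.25 * xs[i - 1] + 0.5 * xs[i] + 0.25 * xs[(i + 1) % n]
        for i in range(n)
    ]
    return blurred[::stride]


# A signal at the highest representable frequency: the worst case for aliasing.
signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
shifted = roll(signal, 1)

# Largest change in the downsampled output caused by the one-sample shift.
naive_gap = max(abs(a - b) for a, b in zip(subsample(signal), subsample(shifted)))
blur_gap = max(
    abs(a - b)
    for a, b in zip(blur_then_subsample(signal), blur_then_subsample(shifted))
)

print("naive subsampling gap:", naive_gap)
print("blur-then-subsample gap:", blur_gap)
```

On this alternating signal the unfiltered output flips between all zeros and all ones depending on the shift, while the blurred version is constant. Real networks and real images are less extreme, so treat this as intuition only.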
Abstract: The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018. These computations have a surprisingly large carbon footprint.... This position paper advocates a practical solution by making efficiency an evaluation criterion for research alongside accuracy and related measures. In addition, we propose reporting the financial cost or "price tag" of developing, training, and running models to provide baselines for the investigation of increasingly efficient methods. Our goal is to make AI both greener and more inclusive, enabling any inspired undergraduate with a laptop to write high-quality research papers…
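As a rough sanity check on the "doubling every few months" figure in the abstract above, the snippet below works out the doubling time implied by a 300,000x increase. It assumes the growth was spread evenly over a full six years (72 months); the abstract only gives the years 2012 and 2018, so the exact span, and therefore the result, is an approximation.

```python
import math

growth_factor = 300_000  # estimated increase quoted in the abstract
span_months = 6 * 12     # assumption: a full six years between 2012 and 2018

# Number of times the compute must double to reach the growth factor.
doublings = math.log2(growth_factor)

# Implied doubling time if growth was steady over the assumed span.
months_per_doubling = span_months / doublings

print(f"doublings: {doublings:.1f}")
print(f"months per doubling: {months_per_doubling:.1f}")
```

Under that assumption the compute doubles roughly every four months; a shorter measurement window would give a proportionally shorter doubling time.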