Deep Learning Weekly Issue #114

XLNet beats BERT, Recommendation at VSCO, Neural Code Search from Facebook, and more...

Hey folks,

This week in deep learning we bring you a theft-detection system from Walmart, a look at VSCO’s ML-powered filter recommendations, an update to MLPerf, and a new Raspberry Pi with a VideoCore VI GPU.

You may also enjoy a trio of projects from Facebook (CNN trained on a billion Instagram photos, a picture-to-recipe generator, and neural code search), an ICML tutorial on population-search in deep learning, and a helpful list of data annotation tools.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


Walmart reveals it's tracking checkout theft with AI-powered cameras in 1,000 stores [Business Insider]

Walmart will use deep learning models to spot items that pass through checkout without being scanned.

Raspberry Pi 4 announced along with AI Core

The next iteration of the Raspberry Pi has been announced and comes with a VideoCore VI GPU.

Suggesting Presets for Images: Building “For This Photo” at VSCO

How the VSCO camera app team uses on-device deep learning to suggest the best filter for a photo.

MLPerf introduces machine learning inference benchmark suite [VentureBeat]

Five new datasets for three tasks (image recognition, object detection, and machine translation) have been released to help benchmark deep learning models.


Playing a game of GANstruction

Artist Helena Sarin describes her process of generating images with GANs.

[Video] ICML 2019 Tutorial: Recent Advances in Population-Based Search for Deep Neural Networks

An in-depth look at evolutionary approaches to deep learning.

The future of AI research is in Africa

Excellent summary of the use cases and human potential of AI in Africa.

Facebook: SOTA on ImageNet Top-1 Accuracy

A ResNeXt model pre-trained on roughly a billion hashtag-labeled Instagram images sets a new state of the art on ImageNet.

Facebook: Using AI to generate recipes from food images

A CNN extracts ingredients from a food image, and the predicted ingredients are fed into a Transformer that generates the recipe as a sequence of instructions.
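The two-stage structure described above (image → ingredient set → recipe sequence) can be sketched in a few lines. Everything here is an illustrative toy, not Facebook's actual model: the ingredient scores stand in for CNN outputs, and a lookup table stands in for the learned Transformer decoder.

```python
# Stage 1: multi-label ingredient prediction -- threshold per-ingredient
# scores that an image encoder (a CNN in the paper) would produce.
INGREDIENTS = ["flour", "egg", "sugar", "butter"]

def predict_ingredients(scores, threshold=0.5):
    return [ing for ing, s in zip(INGREDIENTS, scores) if s >= threshold]

# Stage 2: toy autoregressive decoding conditioned on the ingredient set.
# A real Transformer attends over ingredient embeddings; here a lookup
# table plays the role of the learned conditional distribution.
RECIPES = {
    ("egg", "flour", "sugar"): ["mix", "ingredients", "bake", "<end>"],
}

def generate_recipe(ingredients):
    steps = RECIPES.get(tuple(sorted(ingredients)), ["<unk>"])
    out = []
    for token in steps:            # greedy, token-by-token decoding
        out.append(token)
        if token == "<end>":       # stop once the end token is emitted
            break
    return out

scores = [0.9, 0.8, 0.7, 0.2]      # pretend CNN outputs for one image
ings = predict_ingredients(scores)
print(ings, generate_recipe(ings))
```

The point of the sketch is the interface between the stages: the decoder never sees the image directly, only the predicted ingredient set.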

Facebook: ML-based code search using natural language queries

A new tool called Neural Code Search finds blocks of code related to queries made in natural language.

[Video] Geoffrey Hinton and Yann LeCun to Deliver Turing Lecture

A look at deep learning over the past 4 decades and where we’re heading.

Libraries & Code

Annotation tools for building datasets

A long list of tools for annotating data.

[GitHub] timesler/facenet-pytorch

A PyTorch implementation of FaceNet for facial recognition.

[GitHub] open-mmlab/mmdetection

An open source object detection toolbox based on PyTorch.

Papers & Publications

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation….
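The permutation idea in the abstract can be made concrete on a toy example: for each permutation z of the factorization order, each token is predicted from the tokens that come earlier in z, and the objective averages the log-likelihood over permutations. The uniform "model" below is a stand-in for a real network, chosen only so the numbers are easy to check.

```python
import itertools
import math

# Toy sequence of token ids.
x = [7, 3, 9]
T = len(x)

# Hypothetical toy model: uniform over a 10-token vocabulary regardless
# of context -- stands in for p(x_t | context) from a trained network.
def log_p(token, context):
    return math.log(1.0 / 10.0)

# XLNet maximizes the EXPECTED log-likelihood over permutations z of the
# factorization order: position z_t is predicted from the tokens at
# z_1..z_{t-1}, so across permutations every token sees both left and
# right context without any [MASK] corruption.
total = 0.0
perms = list(itertools.permutations(range(T)))
for z in perms:
    ll = 0.0
    for t, pos in enumerate(z):
        context = [x[z[j]] for j in range(t)]  # tokens earlier in this order
        ll += log_p(x[pos], context)
    total += ll

expected_ll = total / len(perms)
print(expected_ll)  # 3 * log(0.1) for the uniform toy model
```

Note the contrast with BERT: no input tokens are masked out, so there is no pretrain-finetune discrepancy, and dependencies between predicted positions are preserved by the autoregressive factorization.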

EnlightenGAN: Deep Light Enhancement without Paired Supervision

Abstract: ….We propose a highly effective unsupervised generative adversarial network, dubbed EnlightenGAN, that can be trained without low/normal-light image pairs, yet proves to generalize very well on various real-world test images. Instead of supervising the learning using ground truth data, we propose to regularize the unpaired training using the information extracted from the input itself, and benchmark a series of innovations for the low-light image enhancement problem, including a global-local discriminator structure, a self-regularized perceptual loss fusion, and attention mechanism…
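The "information extracted from the input itself" that the abstract mentions includes a self-regularized attention map: the input's own illumination tells the network where to enhance, with no paired ground truth needed. The sketch below follows that description, using the per-pixel max over RGB as a brightness proxy; the exact normalization is an assumption for illustration.

```python
import numpy as np

def self_attention_map(rgb):
    """rgb: float array in [0, 1], shape (H, W, 3).

    Darker regions get higher attention, so the generator is pushed to
    enhance under-exposed areas more than already-bright ones.
    """
    illumination = rgb.max(axis=-1)   # per-pixel brightness proxy
    return 1.0 - illumination         # dark pixels -> high attention

img = np.zeros((2, 2, 3))             # a mostly dark 2x2 toy image
img[0, 0] = [1.0, 1.0, 1.0]           # one fully bright pixel
att = self_attention_map(img)
print(att)
```

Because the map is computed from the unpaired input alone, it fits the paper's setting of training without low/normal-light image pairs.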