Deep Learning Weekly Issue #114
XLNet beats BERT, Recommendation at VSCO, Neural Code Search from Facebook, and more...
This week in deep learning we bring you a theft-detection system from Walmart, a look at VSCO’s ML-powered filter recommendations, an update to MLPerf, and a new Raspberry Pi with a 6 core GPU.
You may also enjoy a trio of projects from Facebook (CNN trained on a billion Instagram photos, a picture-to-recipe generator, and neural code search), an ICML tutorial on population-search in deep learning, and a helpful list of data annotation tools.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Walmart will use deep learning models to detect unscanned items at checkout.
The next iteration of Raspberry Pi has been announced and comes with a 6 core GPU.
How the VSCO camera app team uses on-device deep learning to suggest the best filter for a photo.
Five new datasets for three tasks (image recognition, object detection, and machine translation) have been released to help benchmark deep learning models.
Artist Helena Sarin describes her process of generating images with GANs.
An in-depth look at evolutionary approaches to deep learning.
Excellent summary of the use cases and human potential of AI in Africa.
A ResNet model was pre-trained on Instagram data.
A CNN extracts ingredients from an image, and the ingredients are then fed into a Transformer that outputs a sequence representing the recipe.
A new tool called Neural Code Search finds blocks of code related to queries made in natural language.
A look at deep learning over the past four decades and where we’re heading.
Libraries & Code
A long list of tools for annotating data.
A PyTorch implementation of FaceNet for facial recognition.
An open source object detection toolbox based on PyTorch.
Papers & Publications
Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation….
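To make point (1) of the abstract above concrete, here is a minimal sketch of what "all permutations of the factorization order" means. It is a toy enumeration, not the XLNet implementation: there is no model, no sampling, and no attention masking, and the function names `contexts_for_order` and `all_contexts` are hypothetical. It only shows which positions each token would condition on under each order.

```python
from itertools import permutations


def contexts_for_order(order):
    """Map each position to the positions it conditions on.

    `order` is a factorization order: the token at order[t] is predicted
    from the tokens at order[0], ..., order[t-1].
    """
    return {pos: tuple(sorted(order[:t])) for t, pos in enumerate(order)}


def all_contexts(seq_len):
    """Collect every distinct context each position sees across all orders."""
    seen = {pos: set() for pos in range(seq_len)}
    for order in permutations(range(seq_len)):
        for pos, ctx in contexts_for_order(order).items():
            seen[pos].add(ctx)
    return seen


if __name__ == "__main__":
    # The usual left-to-right order only ever gives left context.
    print(contexts_for_order((0, 1, 2)))
    # A different order lets position 1 condition on both neighbours.
    print(contexts_for_order((2, 0, 1)))
    # Across all orders, each position sees every subset of the others.
    for pos, ctxs in all_contexts(3).items():
        print(pos, sorted(ctxs))
```

Under the order (2, 0, 1), position 1 is predicted from positions 0 and 2, so an autoregressive objective still ends up using context from both sides. Enumerating every order is only feasible for tiny sequences, which is why the abstract speaks of an expected likelihood over orders.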
Abstract: … We propose a highly effective unsupervised generative adversarial network, dubbed EnlightenGAN, that can be trained without low/normal-light image pairs, yet proves to generalize very well on various real-world test images. Instead of supervising the learning using ground truth data, we propose to regularize the unpaired training using the information extracted from the input itself, and benchmark a series of innovations for the low-light image enhancement problem, including a global-local discriminator structure, a self-regularized perceptual loss fusion, and attention mechanism…