Deep Learning Weekly Issue #112

Google's EfficientNet, on-device training from Apple, research ethics, drawing faces from voices, and more...

Hey folks,

This week in deep learning we bring you a new efficient, accurate network architecture from Google, an acquisition by Twitter to help fight fake news, an update to on-device ML from Apple, and a disturbing research trend that brings the ethics of AI into sharp focus.

You may also enjoy an overview of graph neural networks in PyTorch, implementing backpropagation in NumPy, a neural network-based drone autopilot running on a microcontroller, depthwise 3D convolutions in Keras, and extremely data-efficient network training from DeepMind.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling [Google]

Researchers at Google have combined AutoML with model scaling to generate accurate, efficient network architectures that achieve state-of-the-art performance with far smaller, faster models.

Apple releases ARKit 3 and Core ML 3 Betas

Apple announced a slew of new ML-powered features at WWDC, including person occlusion and 3D pose estimation in ARKit, as well as on-device training in Core ML 3.

Twitter bags deep learning talent behind London startup, Fabula AI [TechCrunch]

Twitter has acquired Fabula AI, a startup building technology to identify fake news on social platforms.

[Reddit] Has anyone noticed a lot of ML research into facial recognition of Uyghur people lately?

A disturbing and important discussion of ethics in AI research in light of recent human rights abuses against the Uyghur people.


AI Against Humanity

Play Cards Against Humanity with cards generated by a fine-tuned GPT-2 model.


1 million fake faces

A dataset containing 1 million fake portraits generated by StyleGAN.


Hands-on Graph Neural Networks with PyTorch & PyTorch Geometric

An introduction to GNNs with PyTorch.

CNNs, Part 2: Training a Convolutional Neural Network

Implementing a convolutional neural network and backpropagation from scratch with NumPy.

Speech2Face: Learning the Face Behind a Voice

Generating faces of speakers using only audio of their voice.

PULP Dronet: A 27-gram nano-UAV inspired by insects

A tiny drone with a neural network autopilot running on a microcontroller.

Libraries & Code

Cold Case: The Lost MNIST Digits

An interesting project to reconstruct 50,000 lost test samples from the MNIST dataset.

[GitHub] alexandrosstergiou/keras-DepthwiseConv3D

A Keras implementation (TensorFlow backend) of 3D channel-wise convolutions.

Papers & Publications

Data-Efficient Image Recognition with Contrastive Predictive Coding

Abstract: Large scale deep learning excels when labeled images are abundant, yet data-efficient learning remains a longstanding challenge.... Our work tackles this challenge with Contrastive Predictive Coding, an unsupervised objective which extracts stable structure from still images. The result is a representation which, equipped with a simple linear classifier, separates ImageNet categories better than all competing methods, and surpasses the performance of a fully-supervised AlexNet model. When given a small number of labeled images (as few as 13 per class), this representation retains a strong classification performance, outperforming state-of-the-art semi-supervised methods by 10% Top-5 accuracy and supervised methods by 20%....

An Explicitly Relational Neural Network Architecture

Abstract: With a view to bridging the gap between deep learning and symbolic AI, we present a novel end-to-end neural network architecture that learns to form propositional representations with an explicitly relational structure from raw pixel data. In order to evaluate and analyse the architecture, we introduce a family of simple visual relational reasoning tasks of varying complexity. We show that the proposed architecture, when pre-trained on a curriculum of such tasks, learns to generate reusable representations that better facilitate subsequent learning on previously unseen tasks when compared to a number of baseline architectures. The workings of a successfully trained model are visualised to shed some light on how the architecture functions.