Deep Learning Weekly Issue #142
TF Dev Summit, embedded AI soccer coaches, 3D pose estimation on mobile, Lagrangian Neural Networks and more...
We hope this issue of Deep Learning Weekly finds you safe and healthy.
The global response to the COVID-19 pandemic has been both inspiring and heartbreaking. Its impact is felt in every community, including ours, where students and researchers have been displaced from their homes and seen their academic years cut short. For our part, we want to help amplify the voices of people who are still doing amazing work.
If you are a student or a researcher whose research has been disrupted, whether because a conference was canceled or because you were displaced from school, DM us on Twitter (@dl_weekly) with a tweet describing your work along with a few figures and we’ll retweet it.
As we practice social distancing ourselves, we bring you highlights from the TensorFlow Dev Summit 2020, a neuromorphic chip that can classify odors, an AI ethics checklist, and plans to license cashierless stores from Amazon.
You may also enjoy an AI-powered soccer coaching shoe from Google and Adidas, face and hand tracking in TFJS, a look at Lagrangian neural networks, AI music source separation from Facebook, speech synthesis from Microsoft, and more!
Until next week!
TensorFlow Dev Summit 2020: Livestream Highlights
A nice roundup of the (all-virtual) TensorFlow Dev Summit.
Navigating the New Landscape of AI Platforms
A nice summary of some of the newer ML and AI platforms available.
Intel trains neuromorphic chip to detect 10 different odors
You might be able to train smell classifiers in the near future.
[PDF] Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI
An ethics checklist for AI practitioners.
Amazon to sell its automated checkout technology to third-party retailers
Amazon looks to license the technology behind its cashierless “Go” stores.
Mobile + Edge
Real-Time 3D Object Detection on Mobile Devices with MediaPipe
Through a mix of synthetic and real data, a new MediaPipe model estimates 3D object positioning from 2D images.
Google’s New Shoe Insole Analyzes Your Soccer Moves
Google embeds a microprocessor into insoles and uses ML models to track metrics like “shot power”.
Face and hand tracking in the browser with MediaPipe and TensorFlow.js
Google brings 3D face mesh tracking and 2D hand pose tracking to TFJS via MediaPipe.
Train a TinyML model that can recognize sounds using only 23 kB of RAM
A nice tutorial on training an audio classification model for embedded devices with EdgeImpulse.
Higher accuracy on vision models with EfficientNet-Lite
Google’s EfficientNet architectures get tailored specifically for mobile use with TFLite.
OpenAI has released an extremely lite inference engine, written in C with support for Cortex-M and RTOS as well.
Zoom In: An Introduction to Circuits
A fascinating look at feature maps as composable circuits.
Neural networks that can learn Lagrangian functions and conservation laws straight from pixel data.
Announcing TensorFlow Quantum: An Open Source Library for Quantum Machine Learning
In case you have a quantum computer lying around and would like to do machine learning with it, the TensorFlow team has released a module for you.
One-track minds: Using AI for music source separation
Facebook AI Research creates a model that can split an audio file into separate tracks (e.g., bass, drums, and vocals).
From PyTorch to JAX: towards neural net frameworks that purify stateful code
An excellent look into why JAX is appealing for many applications.
[Reddit] Advanced courses update
The list of advanced ML and deep learning courses on the r/MachineLearning sidebar has been updated.
Libraries & Code
Implementations of few-shot object detection benchmarks
Generating speech in a single forward pass without any attention.
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. Chen et al.
An open database of COVID-19 cases with chest X-ray or CT images.
Papers & Publications
StyleGAN2 Distillation for Feed-forward Image Manipulation
Abstract: StyleGAN2 is a state-of-the-art network in generating realistic images. Besides, it was explicitly trained to have disentangled directions in latent space, which allows efficient image manipulation by varying latent factors. Editing existing images requires embedding a given image into the latent space of StyleGAN2. Latent code optimization via backpropagation is commonly used for qualitative embedding of real world images, although it is prohibitively slow for many applications. We propose a way to distill a particular image manipulation of StyleGAN2 into image-to-image network trained in paired way. The resulting pipeline is an alternative to existing GANs, trained on unpaired data. We provide results of human faces' transformation: gender swap, aging/rejuvenation, style transfer and image morphing. We show that the quality of generation using our method is comparable to StyleGAN2 backpropagation and current state-of-the-art methods in these particular tasks.
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
Abstract: ... In this paper, we first show that pre-training remains important in the context of smaller architectures, and fine-tuning pre-trained compact models can be competitive to more elaborate methods proposed in concurrent work. Starting with pre-trained compact models, we then explore transferring task knowledge from large fine-tuned models through standard knowledge distillation. The resulting simple, yet effective and general algorithm, Pre-trained Distillation, brings further improvements. Through extensive experiments, we more generally explore the interaction between pre-training and distillation under two variables that have been under-studied: model size and properties of unlabeled task data. One surprising observation is that they have a compound effect even when sequentially applied on the same data. To accelerate future research, we will make our 24 pre-trained miniature BERT models publicly available.
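For readers new to the term, here is a minimal sketch of the soft-label objective that "standard knowledge distillation" usually refers to: the student is trained to match the teacher's temperature-softened output distribution. This is an illustration in plain Python, not the paper's code. The function names, the temperature value, and the example logits are all assumptions made for this sketch.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The value is zero when both models produce the same distribution and
    grows as the student's predictions drift away from the teacher's.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
    )


if __name__ == "__main__":
    # Hypothetical logits for a single three-class example.
    teacher = [4.0, 1.0, 0.5]
    student = [2.5, 1.5, 0.2]
    print(distillation_loss(student, teacher))
```

In practice this term is averaged over a batch of (often unlabeled) task examples and minimized with respect to the student's parameters. A higher temperature flattens both distributions so the student also learns from the teacher's relative scores on the non-top classes.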