Deep Learning Weekly Issue #143

Acquisitions by Ikea and Niantic, Huawei's AI framework, Facebook's RegNet model, Core ML + TFLite and more...

Hey folks,

This week in deep learning we bring you acquisitions by Ikea and Niantic, a PyTorch / TensorFlow competitor from Huawei, a look at TensorFlow Lite’s new Core ML delegate, and an impressive new background matting technique from researchers at the University of Washington.

You may also enjoy Facebook’s new GPU-optimized RegNet architecture, AutoAugment at Waymo, a PyTorch implementation of lossless image compression with super-resolution, a new method for simultaneous object detection and tracking, neural radiance fields for view synthesis, a great webinar on tinyML, and more!

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


AI Startups Cut Staff as Coronavirus Slams Economy [Paywall]

Coronavirus has decimated many industries. AI startups in the early days of revenue growth are particularly vulnerable right now.

Artificial intelligence is quietly disrupting the fragrance development process

An interesting application of deep learning to identify candidate fragrances for perfume makers.

Ikea acquires AI imaging startup Geomagical Labs to supercharge room visualisations

Retailers are continuing to invest in new technologies to help consumers try and buy items.

Niantic squares up against Apple and Facebook with acquisition of AR startup

Niantic (of Pokemon Go fame) has acquired an AR startup with technology for rapidly building 3D representations of physical spaces.

Huawei open-sources MindSpore, a framework for AI app development

Huawei open-sources a PyTorch / TensorFlow competitor that is (naturally) optimized for their hardware.

Mobile + Edge

TensorFlow Lite Core ML delegate enables faster inference on iPhones and iPads

TensorFlow Lite will soon gain access to the Apple Neural Engine on iOS devices via a Core ML delegate.

tinyML Talks webcast: Getting started with TinyML

Pete Warden gives a nice introduction to tinyML: running ML models on embedded devices.

Upcoming changes to TensorFlow.js

TensorFlow.js is reorganizing its backends to make deployments smaller.

GPU-Accelerated Mobile Multi-view Style Transfer

A new style transfer training method produces results suitable for multi-view applications like VR.


Dive into Deep Learning [Book]

An interactive deep learning book delivered as Jupyter notebooks, using the NumPy interface.

Using automated data augmentation to advance our Waymo Driver

Waymo improves car perception by using AutoAugment during model training.

Facebook AI RegNet Models Outperform EfficientNet Models, Run 5x Faster on GPUs

New research out of Facebook finds a much faster architecture using a mix of bespoke and computer-aided network design.

Libraries & Code

[GitHub] caoscott/SReC

PyTorch Implementation of "Lossless Image Compression through Super-Resolution"

[GitHub] xingyizhou/CenterTrack

Simultaneous object detection and tracking using center points.

[GitHub] kwotsin/mimicry

A PyTorch library for the reproducibility of GAN research.

[GitHub] szymonmaszke/torchlayers

Keras-like shape inference for PyTorch layers.

Papers & Publications

Background Matting: The World is Your Green Screen

Abstract: We propose a method for creating a matte – the per-pixel foreground color and alpha – of a person by taking photos or videos in an everyday setting with a handheld camera. Most existing matting methods require a green screen background or a manually created trimap to produce a good matte. Automatic, trimap-free methods are appearing, but are not of comparable quality. In our trimap free approach, we ask the user to take an additional photo of the background without the subject at the time of capture. This step requires a small amount of foresight but is far less time consuming than creating a trimap. We train a deep network with an adversarial loss to predict the matte. We first train a matting network with supervised loss on ground truth data with synthetic composites. To bridge the domain gap to real imagery with no labeling, we train another matting network guided by the first network and by a discriminator that judges the quality of composites. We demonstrate results on a wide variety of photos and videos and show significant improvement over the state of the art.
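For readers new to matting, here is a minimal sketch of what a matte is used for once it has been predicted. It is not the authors' code: the `composite` function and the pixel values are made up for illustration, and it only shows the standard compositing equation C = alpha * F + (1 - alpha) * B that places a foreground onto a new background.

```python
def composite(foreground, alpha, background):
    """Blend each pixel as C = alpha * F + (1 - alpha) * B.

    foreground, background: lists of (r, g, b) tuples with values in [0, 1].
    alpha: list of per-pixel opacities in [0, 1] (the matte).
    """
    return [
        tuple(a * f + (1.0 - a) * b for f, b in zip(f_px, b_px))
        for f_px, a, b_px in zip(foreground, alpha, background)
    ]


# Hypothetical data: three red foreground pixels placed over a blue background.
# The alphas model a fully opaque pixel, a soft edge, and pure background.
foreground = [(1.0, 0.0, 0.0)] * 3
alpha = [1.0, 0.5, 0.0]
new_background = [(0.0, 0.0, 1.0)] * 3

result = composite(foreground, alpha, new_background)
print(result)
```

The soft-edge pixel is where matte quality matters most: a wrong alpha there leaves a visible halo of the old background.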

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Abstract: We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location (x,y,z) and viewing direction (θ,ϕ)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.
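As a rough illustration of the "classic volume rendering" step mentioned in the abstract, the sketch below composites samples along a single camera ray using the usual discrete quadrature: each sample contributes its color weighted by its own opacity and by the transmittance of everything in front of it. This is a simplified sketch, not the paper's implementation. The `render_ray` function and the sample values are hypothetical, and in NeRF the densities and colors would come from the network rather than being hard-coded.

```python
import math


def render_ray(densities, colors, deltas):
    """Accumulate color along one ray, front to back.

    densities: volume density (sigma) at each sample.
    colors: (r, g, b) emitted at each sample.
    deltas: distance between consecutive samples.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity of this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        # Light surviving past this sample.
        transmittance *= 1.0 - alpha
    return pixel


# Hypothetical samples: empty space, then an opaque green surface,
# then a blue sample hidden behind it.
densities = [0.0, 1e9, 5.0]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
deltas = [0.1, 0.1, 0.1]

pixel = render_ray(densities, colors, deltas)
print(pixel)
```

Every operation here is differentiable with respect to the densities and colors, which is why, as the abstract notes, images with known camera poses are enough to optimize the representation.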