Deep Learning Weekly Issue #147
Facebook's real-time text-to-speech on CPUs, training neural nets w/out gradients, Sony's AI image sensor, & more
Matthew Moellman | May 20, 2020
This week in deep learning, we bring you a real-time text-to-speech system from Facebook that runs on CPUs, Facebook's computer vision system for shopping, smart cords from Google that use machine learning to detect gestures, and more!
Neural network training developments include a new method for training without gradients and faster training using data echoing.
You may also enjoy some edge hardware developments such as Sony's AI image sensor, the new NVIDIA EGX A100 used in edge servers, or the collaboration between Eta Compute and Edge Impulse to develop low-power edge AI capabilities.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Clara Guardian applies computer vision and other AI technologies to video feeds and sensor data to make hospitals safer.
A Greenpeace report details Silicon Valley’s ties to Big Oil and spurs Google to take a step toward opting out.
Sensors embedded in cords combined with AI enable the use of gestures to control devices.
Facebook developed GrokNet, a computer vision system used to enrich the shopping experience on Facebook Marketplace.
Mobile + Edge
Eta Compute and Edge Impulse collaborate to develop machine learning capabilities for IoT devices with limited battery capacity.
Sony’s new IMX500 image sensor will be able to execute computer vision tasks on-device, with no additional hardware required.
The new NVIDIA EGX A100 can power fast, efficient, and secure edge AI systems.
This tutorial introduces you to Core ML and Vision, two cutting-edge iOS frameworks, and how to fine-tune a model on the device.
This 60-day learning plan will help jumpstart your deep learning career.
Facebook AI has built and deployed a real-time neural text-to-speech system on CPU servers, delivering industry-leading compute efficiency and human-level audio quality.
This dataset is composed of 10 publicly available datasets of natural images (including ImageNet, CUB-200-2011, Fungi, etc.), handwritten characters and doodles.
Libraries & Code
A robust Python tool for text-based AI training and generation using GPT-2 written by Max Woolf, Data Scientist @buzzfeed.
Fit interpretable machine learning models. Explain black box machine learning.
Papers & Publications
Abstract: In the twilight of Moore’s law, GPUs and other specialized hardware accelerators have dramatically sped up neural network training. However, earlier stages of the training pipeline, such as disk I/O and data preprocessing, do not run on accelerators. As accelerators continue to improve, these earlier stages will increasingly become the bottleneck. In this paper, we introduce “data echoing,” which reduces the total computation used by earlier pipeline stages and speeds up training whenever computation upstream from accelerators dominates the training time. Data echoing reuses (or “echoes”) intermediate outputs from earlier pipeline stages in order to reclaim idle capacity. We investigate the behavior of different data echoing algorithms on various workloads, for various amounts of echoing, and for various batch sizes. We find that in all settings, at least one data echoing algorithm can match the baseline’s predictive performance using less upstream computation. We measured a factor of 3.25 decrease in wall-clock time for ResNet-50 on ImageNet when reading training data over a network.
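To make the idea in the abstract above concrete, here is a minimal sketch of data echoing in plain Python. It is not the paper's implementation: the names `echo`, `counting_source`, `echo_factor`, and `shuffle_buffer` are hypothetical, and the optional shuffle buffer is an assumption about how repeated items might be spread out. Each item from the slow upstream stage is reused `echo_factor` times, so the downstream stage gets more work per upstream read.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def echo(upstream: Iterable[T], echo_factor: int,
         shuffle_buffer: int = 0, seed: int = 0) -> Iterator[T]:
    """Yield every upstream item `echo_factor` times.

    With shuffle_buffer > 1, echoed items pass through a small random
    buffer so that copies of one item are not always adjacent
    (an illustrative choice, not taken from the paper).
    """
    if echo_factor < 1:
        raise ValueError("echo_factor must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in upstream:  # one (expensive) upstream read per item
        for _ in range(echo_factor):
            if shuffle_buffer <= 1:
                yield item
                continue
            buffer.append(item)
            if len(buffer) >= shuffle_buffer:
                # Swap a random element to the end and emit it.
                idx = rng.randrange(len(buffer))
                buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
                yield buffer.pop()
    # Flush whatever is still buffered, in random order.
    rng.shuffle(buffer)
    yield from buffer


def counting_source(n: int, log: List[int]) -> Iterator[int]:
    """Hypothetical stand-in for disk I/O: records each read in `log`."""
    for i in range(n):
        log.append(i)
        yield i
```

For example, `list(echo(counting_source(4, log), echo_factor=3))` produces 12 items while reading from the source only 4 times. Where in the pipeline to echo (before or after augmentation, before or after batching) is the kind of design choice the paper compares.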
Abstract: With the growing importance of large network models and enormous training datasets, GPUs have become increasingly necessary to train neural networks. This is largely because conventional optimization algorithms rely on stochastic gradient methods that don’t scale well to large numbers of cores in a cluster setting. Furthermore, the convergence of all gradient methods, including batch methods, suffers from common problems like saturation effects, poor conditioning, and saddle points. This paper explores an unconventional training method that uses alternating direction methods and Bregman iteration to train networks without gradient descent steps. The proposed method reduces the network training problem to a sequence of minimization substeps that can each be solved globally in closed form. The proposed method is advantageous because it avoids many of the caveats that make gradient methods slow on highly non-convex problems. The method exhibits strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
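The abstract above describes substeps that can each be solved globally in closed form. As a rough illustration of what such a substep can look like, the sketch below solves a one-dimensional least-squares fit exactly, with no gradient steps. It is not the paper's algorithm: treating a weight update as a least-squares problem is an assumption here, and the function names `closed_form_weight` and `squared_error` are hypothetical.

```python
from typing import Sequence


def squared_error(w: float, inputs: Sequence[float],
                  targets: Sequence[float]) -> float:
    """Sum of squared residuals for the scalar model target = w * input."""
    return sum((t - w * a) ** 2 for a, t in zip(inputs, targets))


def closed_form_weight(inputs: Sequence[float],
                       targets: Sequence[float]) -> float:
    """Return the w that minimizes squared_error, solved exactly.

    Setting the derivative of sum((t - w*a)^2) to zero gives
    w = sum(a*t) / sum(a*a), so no iterative descent is needed.
    """
    denominator = sum(a * a for a in inputs)
    if denominator == 0:
        raise ValueError("inputs must not all be zero")
    numerator = sum(a * t for a, t in zip(inputs, targets))
    return numerator / denominator
```

Calling `closed_form_weight([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])` returns `2.0`. In a multi-layer setting the same idea would apply to matrices rather than scalars, with the other variables held fixed while one block is solved.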