Deep Learning Weekly Issue #146
Google's reading companion, AI pixel-art, AR + AI cut and paste, guitar effect emulation, and more...
This week in deep learning, we bring you a new app from Google that uses audio recognition to help kids with reading, a state-of-the-art depth model from Facebook, a federated learning project from Intel to detect brain tumors, a partnership between Hailo and Foxconn for AI-specific chips, and an impressive DIY AI stethoscope for under a dollar.
You may also enjoy a podcast discussing the promise of TinyML, an incredible AR + AI project that lets you paste objects from the real world into documents on your laptop, a simple but effective pruning technique from MIT, audio style transfer, research that suggests state-of-the-art models aren’t all that different after all, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Google Lens now lets users take pictures of handwritten notes and copy the text directly to their laptop.
A new app from Google uses audio recognition to help kids with reading skills.
Facebook unveiled a new model that estimates high-quality, consistent depth maps from 2D images.
Research institutions are using federated learning to train tumor classifiers on disparate data sources while preserving privacy.
Chipmaker Hailo has partnered with Foxconn to produce a new SoC for applying AI to video on edge devices.
Mobile + Edge
A new podcast explores ML on embedded devices with Pete Warden.
A nice exploration of performance versus speed / size tradeoffs (or lack thereof) with TensorFlow Lite.
Google’s on-device speech recognition model now boasts higher accuracy than their own production server-side models.
How to build a digital AI stethoscope with $1 worth of equipment.
New pruning technique from researchers at MIT is both simple and effective. Train, prune, retrain, and repeat.
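The train, prune, retrain loop can be sketched in a few lines. This is a minimal illustration, not the MIT researchers' method: the linear model, the magnitude-based pruning criterion, and the names `train`, `prune`, and `iterative_prune` are all assumptions chosen to keep the example small.

```python
import random

def train(w, mask, X, y, lr=0.1, steps=200):
    """Gradient descent on mean squared error; pruned weights are held at zero."""
    n = len(y)
    w = [wj * mj for wj, mj in zip(w, mask)]
    for _ in range(steps):
        errs = [sum(wj * xj for wj, xj in zip(w, row)) - t for row, t in zip(X, y)]
        grad = [2.0 * sum(e * row[j] for e, row in zip(errs, X)) / n
                for j in range(len(w))]
        w = [(wj - lr * gj) * mj for wj, gj, mj in zip(w, grad, mask)]
    return w

def prune(w, mask, fraction):
    """Mask out the smallest-magnitude `fraction` of the surviving weights.

    Magnitude is an assumed criterion, used here for illustration only.
    """
    alive = [j for j, m in enumerate(mask) if m]
    k = int(len(alive) * fraction)
    drop = set(sorted(alive, key=lambda j: abs(w[j]))[:k])
    return [0.0 if j in drop else m for j, m in enumerate(mask)]

def iterative_prune(X, y, rounds=2, fraction=0.5):
    """Train once, then alternate pruning and retraining for `rounds` rounds."""
    w = [0.0] * len(X[0])
    mask = [1.0] * len(w)
    w = train(w, mask, X, y)
    for _ in range(rounds):
        mask = prune(w, mask, fraction)
        w = train(w, mask, X, y)
    return w, mask

# Toy data: only the first two of eight features carry signal.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(64)]
y = [3.0 * row[0] - 2.0 * row[1] for row in X]
w, mask = iterative_prune(X, y)
```

On this toy problem, two rounds at 50% leave two of the eight weights, and the retraining step lets the survivors readjust after each cut. A real network would swap in its own training loop and could use a different pruning criterion.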
Impressive audio style transfer project.
Training neural networks to emulate guitar pedal effects.
New pre-trained EfficientNet checkpoints are now available on TensorFlow Hub.
New research from Facebook details a system capable of choosing which platform to run model inference on based on available resources and network latency.
Convert selfies to pixel art.
Libraries & Code
Incredible demonstration using AI and AR to cut salient objects out of images and beam them directly into documents on your laptop.
Fun mashup of browser-based pose estimation with TFJS and real-time SVG animation.
Papers & Publications
Abstract: Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods. In this paper, we take a closer look at the field to see if this is actually true. We find flaws in the experimental setup of these papers, and propose a new way to evaluate metric learning algorithms. Finally, we present experimental results that show that the improvements over time have been marginal at best.
Abstract: Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model on a target task. We scale up pre-training, and propose a simple recipe that we call Big Transfer (BiT). By combining a few carefully selected components, and transferring using a simple heuristic, we achieve strong performance on over 20 datasets. BiT performs well across a surprisingly wide range of data regimes -- from 1 example per class to 1M total examples. BiT achieves 87.5% top-1 accuracy on ILSVRC-2012, 99.4% on CIFAR-10, and 76.3% on the 19 task Visual Task Adaptation Benchmark (VTAB). On small datasets, BiT attains 76.8% on ILSVRC-2012 with 10 examples per class, and 97.0% on CIFAR-10 with 10 examples per class. We conduct detailed analysis of the main components that lead to high transfer performance.