Deep Learning Weekly Issue #130

PyTorch 1.3, few-shot video synthesis, benchmarking transformers, Bayesian optimization for NAS, and more...

Hey folks,

We’re back after a few weeks of travel to bring you the PyTorch 1.3 release, BERT in Google Search, a set-top box from Nvidia that upscales video to 4K, and general availability for Coral edge TPUs.

You may also enjoy a look at AI advances in 1990-1991, a new few-shot video synthesis model, benchmarks for transformers, a new Detectron library, and a novel application of Bayesian optimization to neural architecture search.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


PyTorch 1.3 adds mobile, privacy, quantization, and named tensors

In case you missed it, PyTorch 1.3 is out with a whole bunch of new and interesting features.

Google is improving 10 percent of searches by understanding language context [The Verge]

NLP is making its way into production with BERT powering Google Search.

Solving a Rubik’s Cube with a Robot Hand [OpenAI]

New research from OpenAI uses a deep learning model to manipulate a Rubik’s Cube with a robot hand. The cube is solved with traditional techniques, but the physical movement happens via DL.

Google’s Coral AI edge hardware launches out of beta [VentureBeat]

Edge TPUs are now generally available, vastly speeding up models deployed in IoT devices.

Nvidia's Newest Set-Top Box Uses AI to Turn Content 4K [Gizmodo]

A super-resolution model in Nvidia’s Shield TV upscales 720p content all the way up to 4K.


Deep Learning: Our Miraculous Year 1990-1991

A fantastic look at many foundational ideas in AI that came from a single year of research nearly 30 years ago.

Few-shot Video-to-Video Synthesis

A novel video synthesis technique from researchers at Nvidia.

Benchmarking Transformers: PyTorch and TensorFlow

The team at Hugging Face benchmarks transformers on various platforms and hardware. TL;DR: all else being equal, PyTorch and TF are about the same, and GPUs make things fast.

Open-sourcing ReAgent, a modular, end-to-end platform for building reasoning systems

Facebook open sources a suite of tools to make it easier to design models that make or rely on decisions.

HAMR — 3D Hand Shape and Pose Estimation from a Single RGB Image

A deep dive into a state-of-the-art technique to fit a 3D mesh to a hand in a single RGB image.

A new dense, sliding-window technique for instance segmentation

Researchers from Facebook announce TensorMask, a new framework for performing instance segmentation.

Libraries & Code

[GitHub] facebookresearch/detectron2

Detectron2 is FAIR's next-generation research platform for object detection and segmentation.

[GitHub] as-ideas/headliner

Easy training and deployment of seq2seq models.

[GitHub] HasnainRaz/Fast-SRGAN

A single-image super-resolution GAN that uses a MobileNet architecture as its generator.

[GitHub] netrasys/pgANN

Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database.

Papers & Publications

Answering Complex Open-domain Questions Through Iterative Query Generation

Abstract: ...We present GoldEn (Gold Entity) Retriever, which iterates between reading context and retrieving more supporting documents to answer open-domain multi-hop questions. Instead of using opaque and computationally expensive neural retrieval models, GoldEn Retriever generates natural language search queries given the question and available context, and leverages off-the-shelf information retrieval systems to query for missing entities. This allows GoldEn Retriever to scale up efficiently for open-domain multi-hop reasoning while maintaining interpretability. We evaluate GoldEn Retriever on the recently proposed open-domain multi-hop QA dataset, HotpotQA, and demonstrate that it outperforms the best previously published model despite not using pretrained language models such as BERT.

BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search

Abstract: ...In this work, we design a NAS algorithm that performs Bayesian optimization using a neural network model. We develop a path-based encoding scheme to featurize the neural architectures that are used to train the neural network model. This strategy is particularly effective for encoding architectures in cell-based search spaces. After training on just 200 random neural architectures, we are able to predict the validation accuracy of a new architecture to within one percent of its true accuracy on average, for popular search spaces. This may be of independent interest beyond Bayesian neural architecture search. We test our algorithm on the NASBench (Ying et al. 2019) and DARTS (Liu et al. 2018) search spaces, and we show that our algorithm outperforms other NAS methods including evolutionary search, reinforcement learning, AlphaX, ASHA, and DARTS. Our algorithm is over 100x more efficient than random search, and 3.8x more efficient than the next-best algorithm on the NASBench dataset....
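
For a feel of what a path-based encoding might look like, here is a minimal sketch. It rests on our own assumptions rather than the authors' code: a cell is a small DAG whose intermediate nodes each carry one operation, a "path" is the sequence of operations met on a route from the cell input to the cell output, and the feature vector has one binary slot per possible operation sequence. The operation names, the toy cell, and the function names are all hypothetical.

```python
from itertools import product

# Hypothetical operation set; real search spaces define their own.
OPS = ["conv3x3", "conv1x1", "maxpool"]


def enumerate_paths(edges, node_ops, source="in", sink="out"):
    """Return the set of op sequences along every source-to-sink path.

    Assumes `edges` describes a DAG (no cycles), as in cell-based spaces.
    """
    paths = set()

    def walk(node, ops_so_far):
        if node == sink:
            paths.add(tuple(ops_so_far))
            return
        for nxt in edges.get(node, []):
            # Only intermediate nodes carry an operation; "in"/"out" do not.
            step = [node_ops[nxt]] if nxt in node_ops else []
            walk(nxt, ops_so_far + step)

    walk(source, [])
    return paths


def path_encoding(edges, node_ops, max_len):
    """Binary vector with one slot per op sequence of length 0..max_len."""
    vocabulary = [
        seq for n in range(max_len + 1) for seq in product(OPS, repeat=n)
    ]
    present = enumerate_paths(edges, node_ops)
    return [1 if seq in present else 0 for seq in vocabulary]


# Toy cell: in -> a -> out, in -> b -> a -> out, plus a skip edge in -> out.
edges = {"in": ["a", "b", "out"], "b": ["a"], "a": ["out"]}
node_ops = {"a": "conv3x3", "b": "maxpool"}

encoding = path_encoding(edges, node_ops, max_len=2)
```

With three operations and paths of at most two operations, the vector has 1 + 3 + 9 = 13 slots, three of which are set for this toy cell. A vector like this could then serve as input to a model that predicts validation accuracy, which is the role the abstract describes for the encoding.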