Deep Learning Weekly Issue #162
5 factors pushing AI to the edge, OpenAI's pricing plans, pose estimation for transparent objects, and more
This week in deep learning we bring you the five factors pushing AI to the edge, OpenAI's pricing plans for its API (with GPT-3 access), this article on using heartbeat detection to identify deepfake videos, and Google AI's KeyPose: Estimating the 3D Pose of Transparent Objects from Stereo.
You may also enjoy this new flow-based video completion algorithm, this post on supernovae detection using CNNs, this PyTorch implementation of Google Brain's WaveGrad vocoder, this post about predicting traffic with graph neural networks, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
This know-it-all AI learns by reading the entire web nonstop
Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages.
OpenAI reveals the pricing plans for its API — and it ain't cheap
The API gives access to the mighty GPT-3 language model
AI researchers use heartbeat detection to identify deepfake videos
As threats of election interference mount, two teams of AI researchers have recently introduced novel approaches to identifying deepfakes by watching for evidence of heartbeats.
First ‘Plug and Play’ Brain Prosthesis Demonstrated in Paralyzed Person
Stable recordings let a brain and ML system build a ‘partnership’ over time.
A biometric surveillance state is not inevitable, says AI Now Institute
In a new report called “Regulating Biometrics: Global Approaches and Urgent Questions,” the AI Now Institute says regulation advocates are beginning to believe a biometric surveillance state is not inevitable.
Mobile + Edge
AI and Vision at the Edge
This article covers five factors pushing AI to the edge: bandwidth, latency, economics, reliability, and privacy.
Arm's latest Cortex-R82 chip aims to enable smarter storage hardware
Arm Ltd. announced the Cortex-R82, a chip designed to enable a new generation of storage devices that will not only hold data but also help process it.
Global Shipments of TinyML Devices to Reach 2.5 Billion by 2030
Industrial and Manufacturing, Smart Cities, and Consumer Applications are driving the need for Tiny Machine Learning chipsets.
Traffic prediction with advanced Graph Neural Networks
Researchers at DeepMind have partnered with the Google Maps team to improve the accuracy of real-time ETAs by up to 50% using advanced machine learning techniques, including Graph Neural Networks.
The Technology Behind our Recent Improvements in Flood Forecasting
This post describes the models powering Google’s flood forecasting technology in India and Bangladesh.
KeyPose: Estimating the 3D Pose of Transparent Objects from Stereo
KeyPose is an ML system that estimates the depth of transparent objects by directly predicting 3D keypoints.
Learning to Summarize with Human Feedback
OpenAI applied reinforcement learning from human feedback to train language models that are better at summarization.
Fast Supernovae Detection using Neural Networks
This post shows how CNNs can help find supernova explosions quickly so they can be studied before the explosion ends.
Search trends dataset of COVID-19 symptoms
This aggregated, anonymized dataset shows trends in search patterns for symptoms and is intended to help researchers better understand the impact of COVID-19.
Libraries & Code
PyTorch implementation of Google Brain's WaveGrad vocoder.
Data and implementation of Facebook’s KILT: a Benchmark for Knowledge Intensive Language Tasks.
Papers & Publications
Flow-edge Guided Video Completion
Abstract: We present a new flow-based video completion algorithm. Previous flow completion methods are often unable to retain the sharpness of motion boundaries. Our method first extracts and completes motion edges, and then uses them to guide piecewise-smooth flow completion with sharp edges. Existing methods propagate colors among local flow connections between adjacent frames. However, not all missing regions in a video can be reached in this way because the motion boundaries form impenetrable barriers. Our method alleviates this problem by introducing non-local flow connections to temporally distant frames, enabling propagating video content over motion boundaries. We validate our approach on the DAVIS dataset. Both visual and quantitative results show that our method compares favorably against the state-of-the-art algorithms.
Learning to summarize from human feedback
Abstract: As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated using ROUGE, but both of these metrics are rough proxies for what we really care about: summary quality. In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. We apply our method to a version of the TL;DR dataset of Reddit posts and find that our models significantly outperform both human reference summaries and much larger models fine-tuned with supervised learning alone. Our models also transfer to CNN/DM news articles, producing summaries nearly as good as the human reference without any news-specific fine-tuning. We conduct extensive analyses to understand our human feedback dataset and fine-tuned models. We establish that our reward model generalizes to new datasets, and that optimizing our reward model results in better summaries than optimizing ROUGE according to humans. We hope the evidence from our paper motivates machine learning researchers to pay closer attention to how their training loss affects the model behavior they actually want.
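For readers curious what "train a model to predict the human-preferred summary" can look like in practice, here is a minimal sketch of a pairwise preference loss. It assumes a logistic (Bradley-Terry style) formulation, which is a common choice for learning from comparisons; the abstract above does not spell out the loss, and the function names and toy scores below are hypothetical.

```python
import math


def preference_loss(reward_preferred: float, reward_other: float) -> float:
    """Negative log-likelihood that the human-preferred summary wins.

    Assumes P(preferred wins) = sigmoid(reward_preferred - reward_other).
    This formulation is an assumption for illustration, not taken from
    the abstract.
    """
    margin = reward_preferred - reward_other
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); branch for stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(score_pairs):
    """Average the loss over (preferred_score, other_score) pairs."""
    return sum(preference_loss(p, o) for p, o in score_pairs) / len(score_pairs)


if __name__ == "__main__":
    # Hypothetical reward-model scores for three human comparisons.
    toy_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
    print(f"mean loss: {mean_preference_loss(toy_pairs):.4f}")
```

The loss shrinks as the reward model scores the human-preferred summary higher than the alternative, and grows when the ranking is reversed. A reward model trained this way could then supply the scalar reward for the reinforcement learning step the abstract describes.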