Deep Learning Weekly Issue #150
IBM's stance on facial recognition, a new SOTA object detection architecture, ML performance on mobile, and more
This week in deep learning we bring you a state-of-the-art object detection paper (with code), Facebook’s TransCoder AI that converts code from one programming language into another, and a tool for deciding how big your language model should be.
You may also enjoy this post on exploration strategies in Reinforcement Learning, this deep learning API, or Uber's interface for running deep learning models from multiple frameworks.
For some deep learning content related to current events, check out this article on IBM's stance on facial recognition, and this facial blur tool in the Signal messaging app.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Facebook researchers this week introduced Situated Interactive MultiModal Conversations, a new research direction aimed at training AI chatbots that can take actions, such as showing an object and explaining what it's made of, in response to images, memories of previous interactions, and individual requests.
IBM Corp. CEO Arvind Krishna is withdrawing his company from the general-purpose facial recognition market over concerns that the technology is being used to promote discrimination and racial injustice.
Researchers at Google have developed a model to separate background noise from speech for their video chat product.
Uber’s autonomous driving group open-sourced Neuropod, a technology designed to reduce the amount of coding enterprise developers have to do to build and deploy artificial intelligence models.
Facebook researchers have developed a neural transcompiler - a system that converts code from one high-level programming language, such as C++, Java, or Python, into another.
Mobile + Edge
The messaging app Signal introduced a new face-blurring feature to protect people’s privacy.
Machine learning on small IoT devices presents several challenges to designers: power consumption, latency, and accuracy.
Mobile apps for measuring neural network computational performance differ in the types of models and optimizations they use, so their results and relevance may vary.
Google AI researchers designed a pre-training self-supervised objective (called gap-sentence generation) for Transformer encoder-decoder models to improve fine-tuning performance on abstractive summarization, achieving state-of-the-art results on 12 diverse summarization datasets.
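The gap-sentence objective can be illustrated with a toy sketch: score each sentence by how much it overlaps with the rest of the document, mask the highest-scoring ones, and use them as the generation target. This is a minimal stdlib-only illustration, not the PEGASUS implementation; the `<mask>` token and the word-overlap importance proxy are simplifications (PEGASUS uses special mask tokens and ROUGE-based selection).

```python
import re
from collections import Counter

MASK_TOKEN = "<mask>"  # stand-in for PEGASUS's special mask token

def gap_sentence_example(document: str, num_masked: int = 1):
    """Toy gap-sentence generation: mask the sentences that overlap most
    with the rest of the document and use them as the target summary."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]

    def overlap_score(i: int) -> int:
        # Importance proxy: word overlap between sentence i and the rest.
        words = Counter(sentences[i].lower().split())
        rest = Counter(w for j, s in enumerate(sentences) if j != i
                       for w in s.lower().split())
        return sum(min(words[w], rest[w]) for w in words)

    ranked = sorted(range(len(sentences)), key=overlap_score, reverse=True)
    masked = set(ranked[:num_masked])
    source = " ".join(MASK_TOKEN if i in masked else s
                      for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(masked))
    return source, target
```

The model is then trained to generate `target` from `source`, which closely resembles the downstream summarization task.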
The trade-off between exploration and exploitation is a central topic in reinforcement learning. This post introduces several common approaches for better exploration in deep RL.
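The simplest exploration strategy discussed in this setting is epsilon-greedy: with probability epsilon the agent tries a random action, otherwise it exploits its current best estimate. A minimal multi-armed-bandit sketch (stdlib only; the arm means and hyperparameters are illustrative, not from the post):

```python
import random

def epsilon_greedy_bandit(true_means, steps=10_000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Gaussian multi-armed bandit: explore a random
    arm with probability epsilon, otherwise pull the arm with the best
    empirical mean reward so far."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental running mean of observed rewards for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts
```

With enough steps, the best arm dominates the pull counts while the epsilon fraction of random pulls keeps the estimates of the other arms from going stale.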
One surprising scaling effect in deep learning is that bigger neural networks are often more compute-efficient than smaller ones. Given a fixed training budget, this tool estimates how big a model should be.
Libraries & Code
Code for the paper titled DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution.
Libra is a deep learning API that lets users run machine learning in their workflows with fluent one-liners.
Code for the paper titled PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.
Papers & Publications
Abstract: Many modern object detectors demonstrate outstanding performances by using the mechanism of looking and thinking twice. In this paper, we explore this mechanism in the backbone design for object detection. At the macro level, we propose Recursive Feature Pyramid, which incorporates extra feedback connections from Feature Pyramid Networks into the bottom-up backbone layers. At the micro level, we propose Switchable Atrous Convolution, which convolves the features with different atrous rates and gathers the results using switch functions. Combining them results in DetectoRS, which significantly improves the performances of object detection. On COCO test-dev, DetectoRS achieves state-of-the-art 54.7% box AP for object detection, 47.1% mask AP for instance segmentation, and 49.6% PQ for panoptic segmentation. The code is made publicly available.
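The core idea of Switchable Atrous Convolution can be sketched in one dimension: apply the same kernel at two different atrous (dilation) rates and blend the two outputs with a switch. This is a minimal stdlib-only illustration under simplifying assumptions; in the paper the switch is a learned, spatially varying function of the input, whereas here it is a fixed scalar.

```python
def atrous_conv1d(x, weights, rate):
    """1-D convolution with dilation `rate` and zero padding (same length)."""
    k = len(weights)
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i + (j - k // 2) * rate  # dilated tap position
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out

def switchable_atrous_conv1d(x, weights, switch, small_rate=1, large_rate=3):
    """Blend the SAME kernel applied at two atrous rates. `switch` in [0, 1]
    stands in for SAC's learned switch function (a fixed scalar here)."""
    small = atrous_conv1d(x, weights, small_rate)
    large = atrous_conv1d(x, weights, large_rate)
    return [switch * s + (1.0 - switch) * l for s, l in zip(small, large)]
```

Setting the switch to 1 recovers the small-rate (local) convolution and setting it to 0 recovers the large-rate (wide receptive field) one; intermediate values interpolate between the two, which is what lets the detector adapt its receptive field per location.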
Abstract: A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. COBOL, Python 2) to a modern one. They typically rely on handcrafted rewrite rules, applied to the source code abstract syntax tree. Unfortunately, the resulting translations often lack readability, fail to respect the target language conventions, and require manual modifications in order to work properly. The overall translation process is time-consuming and requires expertise in both the source and target languages, making code-translation projects expensive. Although neural models significantly outperform their rule-based counterparts in the context of natural language translation, their applications to transcompilation have been limited due to the scarcity of parallel data in this domain. In this paper, we propose to leverage recent approaches in unsupervised machine translation to train a fully unsupervised neural transcompiler. We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy. Our method relies exclusively on monolingual source code, requires no expertise in the source or target languages, and can easily be generalized to other programming languages. We also build and release a test set composed of 852 parallel functions, along with unit tests to check the correctness of translations. We show that our model outperforms rule-based commercial baselines by a significant margin.