Deep Learning Weekly: Issue #247
Google at ICLR 2022, load testing and monitoring AI Platform models, why spectral normalization stabilizes GANs, a paper on bootstrapped meta-learning, and many more
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Google shares its list of invited talks, publications, and workshops for the 10th International Conference on Learning Representations (ICLR 2022) kicking off this week.
The Singapore Management University and the Agency for Science, Technology, and Research (A*Star) just launched a new lab that will apply artificial intelligence technology to tackle high-priority national challenges such as an aging population and polarization.
Pearl, a West Hollywood startup, provides AI for dental images to assist in diagnosis. It landed FDA clearance last month, the first to get such a go-ahead for dentistry AI.
RelationalAI, a startup that combines a database with a knowledge graph, announced that it has raised a $75 million Series B funding round led by Tiger Global.
A new drug development approach from MIT researchers constrains a machine learning model so it only suggests molecular structures that can be synthesized.
10 free resources, including books, repositories, courses, and YouTube channels, to help you start your MLOps learning journey.
A technical article that introduces pipeline debt and goes through Great Expectations' most useful features as well as a couple of clever use cases for the automated testing package.
This post shows how to use SageMaker to easily fine-tune the latest Wav2Vec2 model from Hugging Face, and then deploy the model with a custom-defined inference process to a SageMaker managed inference endpoint.
This document shows you how to test and monitor the online serving performance of machine learning (ML) models that are deployed to AI Platform Prediction.
Meta built an end-to-end AI platform called Looper, with easy-to-use APIs for optimization, personalization, and feedback collection.
An overview of the most common and most useful deep learning techniques.
An article discussing work from Carnegie Mellon University: a proof that spectral normalization controls two well-known failure modes of GAN training, exploding and vanishing gradients.
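For readers who have not met spectral normalization before, here is a minimal sketch of the basic operation: estimate a weight matrix's largest singular value by power iteration, then divide the weights by it. It does not reproduce the Carnegie Mellon analysis. The function names, the plain-list matrix representation, and the iteration count are all assumptions made for this illustration; deep learning frameworks ship their own implementations.

```python
import math
import random


def matvec(W, v):
    """Return W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Return W^T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(x, eps=1e-12):
    """Scale x to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / (norm + eps) for a in x]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(len(W))])
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))
        u = l2_normalize(matvec(W, v))
    # sigma is approximately u^T W v once u and v have converged.
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Divide W by its estimated spectral norm (assumes W is not all zeros)."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Hypothetical 2x2 weight matrix whose singular values are 3 and 1.
W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
```

After normalization the layer's largest singular value is approximately 1, which bounds how much that layer can stretch its input. In practice, implementations typically run only one or a few power-iteration steps per training update and reuse the vector `u` between updates, rather than iterating to convergence as this sketch does.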
Google AI presents a simple and generic method that tackles object detection from a completely different perspective, achieving impressive empirical performance on the widely used COCO dataset.
A technical article showing how to quickly set up a Habana Gaudi instance on Amazon Web Services and fine-tune a BERT model for text classification.
This blog walks through the process of experimenting with hyperparameters, training algorithms, and other parameters of Graph Neural Networks.
Libraries & Code
A package for machine learning inference in FPGAs.
A repository that aims to track the progress in Natural Language Processing (NLP) and gives an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.
Papers & Publications
Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimization problem. We propose an algorithm that tackles this problem by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimizes the meta-learner by minimizing the distance to that target under a chosen (pseudo-)metric. Focusing on meta-learning with gradients, we establish conditions that guarantee performance improvements and show that the metric can be used to control meta-optimization. Meanwhile, the bootstrapping mechanism can extend the effective meta-learning horizon without requiring backpropagation through all updates. We achieve a new state-of-the-art for model-free agents on the Atari ALE benchmark and demonstrate that it yields both performance and efficiency gains in multi-task meta-learning. Finally, we explore how bootstrapping opens up new possibilities and find that it can meta-learn efficient exploration in an epsilon-greedy Q-learning agent, without backpropagating through the update rule.
Knowledge-intensive language tasks require NLP systems to both provide the correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive language models are emerging as the de-facto standard for generating answers, with newer and more powerful systems emerging at an astonishing pace. In this paper we argue that all this (and future) progress can be directly applied to the retrieval problem with minimal intervention to the models' architecture. Previous work has explored ways to partition the search space into hierarchical structures and retrieve documents by autoregressively generating their unique identifier. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers. This setup allows us to use an autoregressive model to generate and score distinctive ngrams, which are then mapped to full passages through an efficient data structure. Empirically, we show this not only outperforms prior autoregressive approaches but also leads to an average improvement of at least 10 points over more established retrieval solutions for passage-level retrieval on the KILT benchmark, establishing new state-of-the-art downstream performance on some datasets, while using a considerably lighter memory footprint than competing systems.
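The idea of treating every ngram in a passage as one of its identifiers can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a plain dictionary stands in for the efficient data structure the abstract mentions, the ngram scores are hard-coded where an autoregressive model would generate them, and summing scores per passage is just one simple way to aggregate them, not necessarily the authors' method.

```python
from collections import defaultdict


def ngrams(tokens, max_n):
    """Yield every ngram of length 1..max_n as a tuple of tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def build_index(passages, max_n=3):
    """Map each ngram to the set of passage ids that contain it."""
    index = defaultdict(set)
    for pid, text in passages.items():
        for gram in ngrams(text.lower().split(), max_n):
            index[gram].add(pid)
    return index


def retrieve(scored_ngrams, index):
    """Rank passages by the summed scores of the ngrams they contain.

    scored_ngrams is a list of (ngram_text, score) pairs, standing in for
    what a language model would generate and score.
    """
    totals = defaultdict(float)
    for gram_text, score in scored_ngrams:
        for pid in index.get(tuple(gram_text.lower().split()), ()):
            totals[pid] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical two-passage corpus and hand-written ngram scores.
passages = {
    "p1": "spectral normalization stabilizes gan training",
    "p2": "great expectations tests data pipelines",
}
index = build_index(passages)
ranking = retrieve(
    [("gan training", 2.0), ("data pipelines", 0.5), ("training", 0.3)],
    index,
)
```

Here `ranking` places `p1` first because two of the scored ngrams occur in it. Note that no passage needs a hand-assigned identifier: any substring the model generates either matches passages in the index or matches nothing.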
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI). These CNN-based methods achieve impressive restoration performance while showing limitations in capturing the long-range dependencies and self-similarity prior. To cope with this problem, we propose a novel Transformer-based method, Multi-stage Spectral-wise Transformer (MST++), for efficient spectral reconstruction. In particular, we employ Spectral-wise Multi-head Self-attention (S-MSA), which is based on the spatially sparse yet spectrally self-similar nature of HSIs, to compose the basic unit, Spectral-wise Attention Block (SAB). Then SABs build up Single-stage Spectral-wise Transformer (SST) that exploits a U-shaped structure to extract multi-resolution contextual information. Finally, our MST++, cascaded by several SSTs, progressively improves the reconstruction quality from coarse to fine. Comprehensive experiments show that our MST++ significantly outperforms other state-of-the-art methods. In the NTIRE 2022 Spectral Reconstruction Challenge, our approach won first place.