Deep Learning Weekly: Issue #247

Google at ICLR 2022, load testing and monitoring AI Platform models, why spectral normalization stabilizes GANs, a paper on bootstrapped meta-learning, and many more

Apr 27, 2022

Hey Folks,

This week in deep learning, we bring you Google at ICLR 2022, load testing and monitoring AI Platform models, why spectral normalization stabilizes GANs, and a paper on bootstrapped meta-learning.

You may also enjoy a Singaporean AI lab for national challenges, a compilation of MLOps resources, a guide to iteratively tuning GNNs, a paper on autoregressive search engines, and more.

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!

Industry

Google at ICLR 2022

Google shares its list of invited talks, publications, and workshops for the 10th International Conference on Learning Representations (ICLR 2022) kicking off this week.

SMU, A*Star launch lab that aims to tackle national issues using AI

The Singapore Management University and the Agency for Science, Technology, and Research (A*Star) just launched a new lab that will apply artificial intelligence technology to tackle high-priority national challenges such as an aging population and polarization.

Tooth Tech: AI Takes Bite Out of Dental Slide Misses by Assisting Doctors

Pearl, a West Hollywood startup, provides AI for dental images to assist in diagnosis. It landed FDA clearance last month, the first to get such a go-ahead for dentistry AI.

RelationalAI wants to change the way intelligent apps are built

RelationalAI, a startup that combines a database with a knowledge graph, announced that it has raised a $75 million Series B funding round led by Tiger Global.

A smarter way to develop new drugs

A new drug development approach from MIT researchers constrains a machine learning model so it only suggests molecular structures that can be synthesized.

MLOps

10 Awesome Resources for Learning MLOps

10 free resources, including books, repositories, courses, and YouTube channels, to help you start your MLOps learning journey.

Reducing Pipeline Debt With Great Expectations

A technical article that introduces pipeline debt and goes through Great Expectations’ most useful features as well as a couple of clever use cases for the automated testing package.

Fine-tune and deploy a Wav2Vec2 model for speech recognition with Hugging Face and Amazon SageMaker

This post shows how to use SageMaker to easily fine-tune the latest Wav2Vec2 model from Hugging Face, and then deploy the model with a custom-defined inference process to a SageMaker managed inference endpoint.

Load testing and monitoring AI Platform models

This document shows you how to test and monitor the online serving performance of machine learning (ML) models that are deployed to AI Platform Prediction.

Inside Meta's AI optimization platform for engineers across the company

Meta built an end-to-end AI platform called Looper, with easy-to-use APIs for optimization, personalization, and feedback collection.

Learning

Deep Learning Techniques you Should Know in 2022

An overview of the most common and most useful deep learning techniques.

Why Spectral Normalization Stabilizes GANs: Analysis and Improvements

An article discussing work from Carnegie Mellon University: a proof that spectral normalization controls two well-known failure modes of GAN training, exploding and vanishing gradients.
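
For readers new to the technique, here is a minimal sketch of spectral normalization in its usual textbook form: divide a weight matrix by an estimate of its largest singular value, obtained by power iteration. This is not the code from the article or the CMU paper. The helper names, the example matrix, and the iteration count are all assumptions made for illustration, and the sketch uses plain Python lists rather than a deep learning framework.

```python
import math
import random


def matvec(m, x):
    """Multiply matrix m (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def normalize(x):
    """Scale vector x to unit Euclidean length."""
    n = math.sqrt(sum(a * a for a in x))
    return [a / n for a in x]


def spectral_normalize(weight, n_iters=100, seed=0):
    """Return (weight / sigma, sigma), where sigma estimates the largest
    singular value of weight via power iteration.

    n_iters and seed are illustrative defaults, not values from the article.
    """
    rng = random.Random(seed)
    wt = transpose(weight)
    # Random start so the vector is not orthogonal to the top singular direction.
    u = normalize([rng.gauss(0.0, 1.0) for _ in weight])
    for _ in range(n_iters):
        v = normalize(matvec(wt, u))      # estimate of the right singular vector
        u = normalize(matvec(weight, v))  # estimate of the left singular vector
    sigma = sum(a * b for a, b in zip(u, matvec(weight, v)))
    return [[a / sigma for a in row] for row in weight], sigma


# Hypothetical 2x2 layer weight; its largest singular value is sqrt(40).
W = [[4.0, 0.0], [3.0, -5.0]]
W_sn, sigma = spectral_normalize(W)
print(sigma)
```

After normalization the matrix has a largest singular value of about 1, which bounds how much the layer can stretch its input. Framework implementations typically run only one power-iteration step per training update and reuse the vector between updates, while this sketch iterates to convergence for clarity.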

Pix2Seq: A New Language Interface for Object Detection

Google AI presents a simple and generic method that tackles object detection from a completely different perspective, achieving impressive empirical performance on the widely used COCO dataset.

Getting Started with Transformers on Habana Gaudi

A technical article showing how to quickly set up a Habana Gaudi instance on Amazon Web Services and fine-tune a BERT model for text classification.

Guide to Iteratively Tuning GNNs

This blog walks through the process of experimenting with hyperparameters, training algorithms, and other parameters of Graph Neural Networks.

Libraries & Code

fastmachinelearning/hls4ml

A package for machine learning inference in FPGAs.

sebastianruder/NLP-progress

A repository that aims to track the progress in Natural Language Processing (NLP) and gives an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.

Papers & Publications

Bootstrapped Meta-Learning

Abstract:

Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimization problem. We propose an algorithm that tackles this problem by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimizes the meta-learner by minimizing the distance to that target under a chosen (pseudo-)metric. Focusing on meta-learning with gradients, we establish conditions that guarantee performance improvements and show that metric can be used to control meta-optimization. Meanwhile, the bootstrapping mechanism can extend the effective meta-learning horizon without requiring backpropagation through all updates. We achieve a new state-of-the-art for model-free agents on the Atari ALE benchmark and demonstrate that it yields both performance and efficiency gains in multi-task meta-learning. Finally, we explore how bootstrapping opens up new possibilities and find that it can meta-learn efficient exploration in an epsilon-greedy Q-learning agent - without backpropagating through the update rule.

Autoregressive Search Engines: Generating Substrings as Document Identifiers

Abstract:

Knowledge-intensive language tasks require NLP systems to both provide the correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive language models are emerging as the de-facto standard for generating answers, with newer and more powerful systems emerging at an astonishing pace. In this paper we argue that all this (and future) progress can be directly applied to the retrieval problem with minimal intervention to the models' architecture. Previous work has explored ways to partition the search space into hierarchical structures and retrieve documents by autoregressively generating their unique identifier. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers. This setup allows us to use an autoregressive model to generate and score distinctive ngrams, that are then mapped to full passages through an efficient data structure. Empirically, we show this not only outperforms prior autoregressive approaches but also leads to an average improvement of at least 10 points over more established retrieval solutions for passage-level retrieval on the KILT benchmark, establishing new state-of-the-art downstream performance on some datasets, while using a considerably lighter memory footprint than competing systems.
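
To make the core idea concrete, here is a toy sketch of using ngrams as passage identifiers. It is not the paper's implementation. A plain dictionary stands in for the paper's "efficient data structure", the ngram scores are hard-coded where a real system would take them from an autoregressive model, and the scores are simply summed per passage. Every function name, passage, and score below is hypothetical.

```python
from collections import defaultdict


def ngrams(tokens, max_n):
    """Yield every ngram of length 1..max_n as a tuple of tokens."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def build_index(passages, max_n=3):
    """Map each ngram to the set of passage ids that contain it."""
    index = defaultdict(set)
    for pid, text in passages.items():
        for gram in ngrams(text.lower().split(), max_n):
            index[gram].add(pid)
    return index


def retrieve(index, scored_ngrams):
    """Rank passages by the summed score of the generated ngrams they contain.

    scored_ngrams is a list of (ngram_text, score) pairs; in a real system
    these would come from the language model (an assumption of this sketch).
    """
    totals = defaultdict(float)
    for text, score in scored_ngrams:
        for pid in index.get(tuple(text.lower().split()), ()):
            totals[pid] += score
    # Highest total first; ties broken by passage id for determinism.
    return sorted(totals, key=lambda pid: (-totals[pid], pid))


# Hypothetical three-passage corpus.
passages = {
    "p1": "spectral normalization stabilizes gan training",
    "p2": "graph neural networks need careful tuning",
    "p3": "gan training can suffer from vanishing gradients",
}
index = build_index(passages)

# Stand-in for ngrams a model might generate for a query, with made-up scores.
generated = [("gan training", 1.0), ("vanishing gradients", 2.0)]
ranked = retrieve(index, generated)
print(ranked)
```

Here "p3" ranks first because it contains both generated ngrams, "p1" matches only one, and "p2" is never retrieved. No document ever needs a dedicated identifier: any distinctive substring can point back to its passage.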

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Abstract:

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI). These CNN-based methods achieve impressive restoration performance while showing limitations in capturing the long-range dependencies and self-similarity prior. To cope with this problem, we propose a novel Transformer-based method, Multi-stage Spectral-wise Transformer (MST++), for efficient spectral reconstruction. In particular, we employ Spectral-wise Multi-head Self-attention (S-MSA) that is based on the HSI spatially sparse while spectrally self-similar nature to compose the basic unit, Spectral-wise Attention Block (SAB). Then SABs build up Single-stage Spectral-wise Transformer (SST) that exploits a U-shaped structure to extract multi-resolution contextual information. Finally, our MST++, cascaded by several SSTs, progressively improves the reconstruction quality from coarse to fine. Comprehensive experiments show that our MST++ significantly outperforms other state-of-the-art methods. In the NTIRE 2022 Spectral Reconstruction Challenge, our approach won the First place.

A guest post by

Miko Planas
