Deep Learning Weekly: Issue #271
Google's Imagen Video, A/B Testing with Kubernetes and Seldon Core, Quantum Convolutional Neural Networks, a paper on a language modeling approach to audio generation, and many more
Hey Folks,
This week in deep learning, we bring you Google's Imagen Video, A/B Testing with Kubernetes and Seldon Core, Quantum Convolutional Neural Networks, and a paper on a language modeling approach to audio generation.
You may also enjoy TensorFlow's CircularNet, Japanese Stable Diffusion, a neural radiance field acceleration toolbox, a paper on discovering faster matrix multiplication algorithms using reinforcement learning, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Microsoft open-sources AI algorithms for optimizing farm operations
Microsoft open-sourced FarmVibes.AI, a collection of artificial intelligence models that farm operators can use to perform tasks such as planting crops more efficiently.
Google answers Meta’s video-generating AI with its own, dubbed Imagen Video
Not to be outdone by Meta’s Make-A-Video, Google details its work on Imagen Video, an AI system that can generate video clips given a text prompt.
AI Esperanto: Large Language Models Read Data With NVIDIA Triton
Companies bringing natural language processing to many markets are turning to Triton for AI inference.
CircularNet: Reducing waste with Machine Learning
The TensorFlow team introduces “CircularNet”, a set of models that lowers the barriers to applying AI/ML to waste identification, along with all the benefits this new level of transparency can offer.
MLOps
A practical guide to A/B Testing in MLOps with Kubernetes and Seldon Core
A blog post that shows how to create a containerised microservice architecture that is easy to deploy, monitor, and scale whenever A/B tests are run.
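As a rough illustration of the pattern the post describes, the sketch below uses the Kubernetes Python client to apply a SeldonDeployment that splits traffic 75/25 between two model versions. The deployment name, namespace, and model URIs are hypothetical placeholders, not taken from the post.

```python
# A minimal sketch (not the post's exact manifest): a SeldonDeployment
# that routes 75% of traffic to model A and 25% to model B.
from kubernetes import client, config

config.load_kube_config()  # assumes a configured kubectl context

ab_test = {
    "apiVersion": "machinelearning.seldon.io/v1",
    "kind": "SeldonDeployment",
    "metadata": {"name": "ab-test"},  # hypothetical name
    "spec": {
        "predictors": [
            {
                "name": "model-a",
                "traffic": 75,  # percentage of requests routed here
                "replicas": 1,
                "graph": {
                    "name": "classifier",
                    "implementation": "SKLEARN_SERVER",
                    "modelUri": "gs://my-bucket/model-a",  # placeholder
                },
            },
            {
                "name": "model-b",
                "traffic": 25,
                "replicas": 1,
                "graph": {
                    "name": "classifier",
                    "implementation": "SKLEARN_SERVER",
                    "modelUri": "gs://my-bucket/model-b",  # placeholder
                },
            },
        ]
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="machinelearning.seldon.io",
    version="v1",
    namespace="seldon",  # hypothetical namespace
    plural="seldondeployments",
    body=ab_test,
)
```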
Tracking JAX and Flax models with Comet
This article covers how to track JAX and Flax models with Comet.
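For a feel of the workflow, here is a minimal sketch of logging a toy JAX training loop to Comet. The project name and hyperparameters are illustrative; `comet_ml.Experiment` with `log_parameters`/`log_metric` is the standard Comet Python API.

```python
# Minimal sketch: track a toy JAX gradient-descent loop with Comet.
import jax
import jax.numpy as jnp
from comet_ml import Experiment

experiment = Experiment(project_name="jax-demo")  # reads COMET_API_KEY from env
experiment.log_parameters({"learning_rate": 0.1, "steps": 100})

# Toy least-squares problem: fit w so that w * x matches y = 3x.
x = jnp.linspace(-1.0, 1.0, 64)
y = 3.0 * x

def loss_fn(w):
    return jnp.mean((w * x - y) ** 2)

grad_fn = jax.jit(jax.grad(loss_fn))

w = 0.0
for step in range(100):
    w -= 0.1 * grad_fn(w)
    experiment.log_metric("loss", float(loss_fn(w)), step=step)

experiment.end()
```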
Managing GPU Costs for Production AI
This blog focuses on optimizations you can make to generate cost savings while using GPUs for running inference in production.
A crash course on how to use Dagster and how it compares to other orchestrators.
Learning
A Quick Guide to Quantum Convolutional Neural Networks
An article that provides a high-level discussion of existing research and applications of quantum deep learning, focusing on hybrid quantum convolutional neural networks (QCNNs).
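To make the "hybrid" idea concrete, here is a minimal sketch (assuming PennyLane) of a quantum convolution: a small parameterized circuit acts as a filter over 2x2 image patches, and its expectation values become classical feature maps for a downstream classical network. The circuit layout and weights are illustrative, not taken from the article.

```python
# Minimal sketch of a quantum convolution ("quanvolution") filter.
import numpy as np
import pennylane as qml

n_qubits = 4  # one qubit per pixel of a 2x2 patch
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def quantum_filter(patch, weights):
    # Encode the four pixel intensities as rotation angles.
    for i, pixel in enumerate(patch):
        qml.RY(np.pi * pixel, wires=i)
    # Trainable entangling layers act as the "filter".
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

weights = np.random.uniform(0, 2 * np.pi, size=(2, n_qubits))
image = np.random.rand(8, 8)  # toy grayscale image in [0, 1]

# Slide the filter over non-overlapping 2x2 patches -> 4 feature maps.
features = np.zeros((4, 4, n_qubits))
for r in range(0, 8, 2):
    for c in range(0, 8, 2):
        patch = image[r:r + 2, c:c + 2].flatten()
        features[r // 2, c // 2] = quantum_filter(patch, weights)
```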
Japanese Stable Diffusion
In this blog, we will discuss the background of the development of Japanese Stable Diffusion and its learning methodology.
Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
An article explaining how concept embedding models go beyond the current accuracy-explainability tradeoff.
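A rough sketch of the core idea, with hypothetical shapes: each concept gets a pair of learned "active"/"inactive" embeddings that are mixed by the predicted concept probability, so the bottleneck stays interpretable per concept while passing high-dimensional state to the label predictor.

```python
# Rough sketch of a concept embedding layer (shapes are illustrative).
import torch
import torch.nn as nn

class ConceptEmbedding(nn.Module):
    def __init__(self, in_dim, n_concepts, emb_dim):
        super().__init__()
        # Per-concept "active" and "inactive" embedding generators.
        self.pos = nn.ModuleList([nn.Linear(in_dim, emb_dim) for _ in range(n_concepts)])
        self.neg = nn.ModuleList([nn.Linear(in_dim, emb_dim) for _ in range(n_concepts)])
        # Scoring head that predicts each concept's probability from its pair.
        self.score = nn.Linear(2 * emb_dim, 1)

    def forward(self, h):
        embs, probs = [], []
        for pos, neg in zip(self.pos, self.neg):
            c_pos, c_neg = pos(h), neg(h)
            p = torch.sigmoid(self.score(torch.cat([c_pos, c_neg], dim=-1)))
            embs.append(p * c_pos + (1 - p) * c_neg)  # mix by concept probability
            probs.append(p)
        # Concatenated embeddings feed the label predictor; the probabilities
        # are supervised with concept labels, which keeps them interpretable.
        return torch.cat(embs, dim=-1), torch.cat(probs, dim=-1)

layer = ConceptEmbedding(in_dim=128, n_concepts=5, emb_dim=16)
emb, p = layer(torch.randn(8, 128))  # emb: (8, 80), p: (8, 5)
```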
How Wish A/B tests percentiles
In this article, Wish shares their journey of developing an A/B testing methodology for percentiles.
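Comparing percentiles between variants is harder than comparing means because percentile estimators lack a simple closed-form variance; one standard approach (not necessarily Wish's exact method) is bootstrapping the difference, sketched below on simulated latency data.

```python
# Sketch: bootstrap a confidence interval for the difference in p95
# latency between control and treatment. The data here is simulated.
import numpy as np

rng = np.random.default_rng(0)
control = rng.lognormal(mean=5.0, sigma=0.6, size=10_000)     # simulated latencies (ms)
treatment = rng.lognormal(mean=4.95, sigma=0.6, size=10_000)

def bootstrap_pctl_diff(a, b, q=95, n_boot=5_000):
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        a_s = rng.choice(a, size=a.size, replace=True)
        b_s = rng.choice(b, size=b.size, replace=True)
        diffs[i] = np.percentile(b_s, q) - np.percentile(a_s, q)
    return diffs

diffs = bootstrap_pctl_diff(control, treatment)
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"p95 shift: {diffs.mean():.1f} ms, 95% CI [{lo:.1f}, {hi:.1f}]")
# If the interval excludes zero, the p95 change is statistically significant.
```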
How to Implement Multi-Head Attention From Scratch in TensorFlow and Keras
In this tutorial, you will discover how to implement multi-head attention from scratch in TensorFlow and Keras.
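The heart of the technique is splitting queries, keys, and values into heads and applying scaled dot-product attention to each head in parallel; a condensed sketch of that computation (not the tutorial's exact code) looks like this.

```python
# Condensed sketch of multi-head attention in TensorFlow/Keras.
import tensorflow as tf

class MultiHeadAttention(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.depth = d_model // num_heads  # per-head dimension
        self.wq = tf.keras.layers.Dense(d_model)
        self.wk = tf.keras.layers.Dense(d_model)
        self.wv = tf.keras.layers.Dense(d_model)
        self.wo = tf.keras.layers.Dense(d_model)

    def split_heads(self, x, batch):
        # (batch, seq, d_model) -> (batch, heads, seq, depth)
        x = tf.reshape(x, (batch, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, q, k, v):
        batch = tf.shape(q)[0]
        q = self.split_heads(self.wq(q), batch)
        k = self.split_heads(self.wk(k), batch)
        v = self.split_heads(self.wv(v), batch)
        # Scaled dot-product attention, applied to every head in parallel.
        scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(
            tf.cast(self.depth, tf.float32))
        out = tf.matmul(tf.nn.softmax(scores, axis=-1), v)
        # (batch, heads, seq, depth) -> (batch, seq, d_model), then project.
        out = tf.transpose(out, perm=[0, 2, 1, 3])
        return self.wo(tf.reshape(out, (batch, -1, self.num_heads * self.depth)))

mha = MultiHeadAttention(d_model=64, num_heads=8)
x = tf.random.normal((2, 10, 64))  # (batch, seq_len, d_model)
print(mha(x, x, x).shape)          # self-attention -> (2, 10, 64)
```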
Libraries & Code
A computer vision framework for self-supervised learning.
A PyTorch NeRF acceleration toolbox for both training and inference. It focuses on efficient volumetric rendering of radiance fields.
A development framework for researching and implementing data science projects as well as an MLOps platform capable of managing their entire life cycles.
Papers & Publications
Discovering faster matrix multiplication algorithms with reinforcement learning
Abstract:
Improving the efficiency of algorithms for fundamental computations can have a widespread impact, as it can affect the overall speed of a large amount of computations. Matrix multiplication is one such primitive task, occurring in many systems—from neural networks to scientific computing routines. The automatic discovery of algorithms using machine learning offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. However, automating the algorithm discovery procedure is intricate, as the space of possible algorithms is enormous. Here we report a deep reinforcement learning approach based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices. Our agent, AlphaTensor, is trained to play a single-player game where the objective is finding tensor decompositions within a finite factor space. AlphaTensor discovered algorithms that outperform the state-of-the-art complexity for many matrix sizes. Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago. We further showcase the flexibility of AlphaTensor through different use-cases: algorithms with state-of-the-art complexity for structured matrix multiplication and improved practical efficiency by optimizing matrix multiplication for runtime on specific hardware. Our results highlight AlphaTensor’s ability to accelerate the process of algorithmic discovery on a range of problems, and to optimize for different criteria.
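For intuition about the "tensor decomposition" framing: each low-rank decomposition of the matrix multiplication tensor corresponds to an algorithm using that many scalar multiplications. Strassen's classic rank-7 scheme for 2x2 blocks, sketched below, is exactly the kind of object AlphaTensor's game searches for (the code is an illustration, not AlphaTensor's output).

```python
# Strassen's 2x2 algorithm: 7 multiplications instead of 8, the kind of
# low-rank decomposition AlphaTensor's single-player game searches for.
import numpy as np

def strassen_2x2(A, B):
    a11, a12, a21, a22 = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    b11, b12, b21, b22 = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m3 + m6 + m7]])

A, B = np.random.rand(2, 2), np.random.rand(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)  # provably correct, 7 multiplies
```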
AudioLM: a Language Modeling Approach to Audio Generation
Abstract:
We introduce AudioLM, a framework for high-quality audio generation with long-term consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts audio generation as a language modeling task in this representation space. We show how existing audio tokenizers provide different trade-offs between reconstruction quality and long-term structure, and we propose a hybrid tokenization scheme to achieve both objectives. Namely, we leverage the discretized activations of a masked language model pre-trained on audio to capture long-term structure and the discrete codes produced by a neural audio codec to achieve high-quality synthesis. By training on large corpora of raw audio waveforms, AudioLM learns to generate natural and coherent continuations given short prompts. When trained on speech, and without any transcript or annotation, AudioLM generates syntactically and semantically plausible speech continuations while also maintaining speaker identity and prosody for unseen speakers. Furthermore, we demonstrate how our approach extends beyond speech by generating coherent piano music continuations, despite being trained without any symbolic representation of music.
Learning Bias-reduced Word Embeddings Using Dictionary Definitions
Abstract:
Pre-trained word embeddings, such as GloVe, have shown undesirable gender, racial, and religious biases. To address this problem, we propose DD-GloVe, a train-time de-biasing algorithm to learn word embeddings by leveraging dictionary definitions. We introduce dictionary-guided loss functions that encourage word embeddings to be similar to their relatively neutral dictionary definition representations. Existing de-biasing algorithms typically need a pre-compiled list of seed words to represent the bias direction, along which biased information gets removed. Producing this list involves subjective decisions, and it might be difficult to obtain for some types of biases. We automate the process of finding seed words: our algorithm starts from a single pair of initial seed words and automatically finds more words whose definitions display similar attributes. We demonstrate the effectiveness of our approach with benchmark evaluations and empirical analyses.
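A schematic of the dictionary-guided idea: pull each word vector toward a representation of its relatively neutral definition. The mean pooling and cosine loss below are simplifications, not the paper's exact loss functions.

```python
# Schematic sketch of a dictionary-guided loss: encourage a word's
# embedding to stay close to the average embedding of its definition words.
import torch
import torch.nn.functional as F

def dictionary_loss(word_vec, definition_vecs):
    """word_vec: (d,); definition_vecs: (n_def_words, d)."""
    definition_rep = definition_vecs.mean(dim=0)  # simple mean pooling
    # 1 - cosine similarity: zero when the word matches its definition.
    return 1.0 - F.cosine_similarity(word_vec, definition_rep, dim=0)

d = 50
word = torch.randn(d, requires_grad=True)
definition = torch.randn(6, d)  # embeddings of the definition's words
loss = dictionary_loss(word, definition)
loss.backward()  # gradients pull `word` toward its definition representation
```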