Deep Learning Weekly: Issue #235
OpenAI's InstructGPT, a blog on distributed training, steering towards effective autonomous vehicle policy, a paper on natural language descriptions of deep visual features, and more
This week in deep learning, we bring you OpenAI's aligned and truthful models called InstructGPT, a blog on distributed training, steering towards effective autonomous vehicle policy, and a paper on natural language descriptions of deep visual features.
You may also enjoy a neural network that could identify topology from a material's XAS signature, an article on the best practices for experiment management, an introduction to the text and code embeddings in the OpenAI API, a paper on controlling neural networks with rule representations, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
OpenAI trained language models (InstructGPT) that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through alignment research.
Doctoral candidate Nina Andrejević combines spectroscopy and machine learning techniques to identify novel and valuable properties in matter.
Rail Vision, an Israel-based NVIDIA Metropolis partner, offers AI-powered obstacle-detection and classification systems for railways.
Domino Data Lab Inc. announced a big update to its platform, saying the new capabilities will help to improve “model velocity” and get artificial intelligence models into production faster.
Deepnote, a company building a solution for more collaborative data science notebooks, announced this week that it has raised $20 million in Series A funding.
A comprehensive blog discussing what distributed training is, which frameworks are involved, and how it can solve the problem of training a complex machine learning model on a huge dataset.
A blog post highlighting best practices for standardizing your experimentation process and managing its iterative nature.
Canonical Ltd. released Charmed Kubeflow 1.4, the newest version of its platform for simplifying enterprise artificial intelligence projects.
A case study on how the Geodata team at Mapbox is incrementally adopting Dagster, a pipeline orchestration platform that may soon replace Airflow.
An article showcasing how to use the NVIDIA TAO Toolkit, a CLI-based solution of the NVIDIA TAO framework, along with Appen’s data labeling platform to simplify the overall training process and create highly customized models.
An article discussing the policies, direction, ethical dilemmas, and country-specific issues related to the effective deployment of autonomous vehicles.
OpenAI introduces embeddings, a new endpoint in the OpenAI API that makes it easy to perform natural language and code tasks like semantic search, clustering, topic modeling, and classification.
Ryan Smith, machine learning research engineer at Snorkel AI, talks about prompting methods with language models and some applications they have with weak supervision.
A technical guide on how to implement a convolutional neural network, from data preparation to evaluation, using PyTorch.
An article discussing the common algorithms that are used for different applications.
Libraries & Code
Notebooks that can be used as tutorials for running machine learning workflows with LightGBM using Dask.
An RPC library to help you perform distributed machine learning research, particularly reinforcement learning.
Papers & Publications
Some neurons in deep networks specialize in recognizing highly specific perceptual, structural, or semantic features of inputs. In computer vision, techniques exist for identifying neurons that respond to individual concept categories like colors, textures, and object classes. But these techniques are limited in scope, labeling only a small subset of neurons and behaviors in any network. Is a richer characterization of neuron-level computation possible? We introduce a procedure (called MILAN, for mutual-information-guided linguistic annotation of neurons) that automatically labels neurons with open-ended, compositional, natural language descriptions. Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active. MILAN produces fine-grained descriptions that capture categorical, relational, and logical structure in learned features. These descriptions obtain high agreement with human-generated feature descriptions across a diverse set of model architectures and tasks, and can aid in understanding and controlling learned models. We highlight three applications of natural language neuron descriptions. First, we use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models. Second, we use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features. Finally, we use MILAN for editing, improving robustness in an image classifier by deleting neurons sensitive to text features spuriously correlated with class labels.
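To make the selection criterion in the abstract above concrete, here is a minimal sketch of choosing the description that maximizes pointwise mutual information with a neuron's active image regions. The candidate strings and every probability are invented for illustration. In a real system both distributions would have to be estimated by learned models, and this toy lookup table is not the authors' implementation.

```python
import math


def pmi(p_desc_given_regions: float, p_desc: float) -> float:
    """Pointwise mutual information: log p(d | E) - log p(d)."""
    return math.log(p_desc_given_regions) - math.log(p_desc)


def best_description(candidates: dict) -> str:
    """Return the candidate description with the highest PMI."""
    return max(candidates, key=lambda d: pmi(*candidates[d]))


# Hypothetical values: description -> (p(d | active regions), p(d)).
toy_candidates = {
    "objects": (0.30, 0.25),
    "animal faces": (0.20, 0.02),
    "dog faces with floppy ears": (0.10, 0.001),
}

if __name__ == "__main__":
    for desc, (cond, prior) in toy_candidates.items():
        print(f"{desc!r}: PMI = {pmi(cond, prior):.2f}")
    print("selected:", best_description(toy_candidates))
```

In this toy table the generic label "objects" has the highest conditional probability, but it is also likely regardless of the neuron, so its PMI is low. Subtracting the log prior is what favors the more specific description.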
We propose a novel training method that integrates rules into deep learning, in a way the strengths of the rules are controllable at inference. Deep Neural Networks with Controllable Rule Representations (DeepCTRL) incorporates a rule encoder into the model coupled with a rule-based objective, enabling a shared representation for decision making. DeepCTRL is agnostic to data type and model architecture. It can be applied to any kind of rule defined for inputs and outputs. The key aspect of DeepCTRL is that it does not require retraining to adapt the rule strength -- at inference, the user can adjust it based on the desired operation point on accuracy vs. rule verification ratio. In real-world domains where incorporating rules is critical -- such as Physics, Retail and Healthcare -- we show the effectiveness of DeepCTRL in teaching rules for deep learning. DeepCTRL improves the trust and reliability of the trained models by significantly increasing their rule verification ratio, while also providing accuracy gains at downstream tasks. Additionally, DeepCTRL enables novel use cases such as hypothesis testing of the rules on data samples, and unsupervised adaptation based on shared rules between datasets.
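The idea of adjusting rule strength at inference without retraining can be illustrated with a small sketch. It blends a rule representation and a data representation using a coefficient `alpha` that the caller picks at prediction time. Everything here is an assumption made for illustration: the `BlendedModel` class, the linear encoders, the concatenation step, and the random untrained weights are hypothetical stand-ins rather than the DeepCTRL architecture or its training procedure.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix standing in for a trained layer (hypothetical)."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, x):
    """Matrix-vector product without bias."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class BlendedModel:
    """Toy model with a rule encoder, a data encoder, and a decision layer."""

    def __init__(self, n_in=3, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.rule_encoder = make_linear(n_in, n_hidden, rng)
        self.data_encoder = make_linear(n_in, n_hidden, rng)
        self.decision = make_linear(2 * n_hidden, 1, rng)

    def predict(self, x, alpha):
        """alpha = 1 uses only the rule representation, alpha = 0 only the data one."""
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must be in [0, 1]")
        z_rule = [alpha * v for v in apply_linear(self.rule_encoder, x)]
        z_data = [(1 - alpha) * v for v in apply_linear(self.data_encoder, x)]
        # Concatenate the two scaled representations before the decision layer.
        return apply_linear(self.decision, z_rule + z_data)[0]


if __name__ == "__main__":
    model = BlendedModel()
    x = [1.0, 2.0, 3.0]
    for alpha in (0.0, 0.5, 1.0):
        print(f"alpha={alpha}: prediction={model.predict(x, alpha):.4f}")
```

The same weights serve every value of `alpha`, so sweeping it at inference traces out different operating points without any retraining, which is the property the abstract emphasizes.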
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a guiding mechanism to control the image generation process without retraining. However, since these models typically operate directly in pixel space, optimization of powerful DMs often consumes hundreds of GPU days and inference is expensive due to sequential evaluations. To enable DM training on limited computational resources while retaining their quality and flexibility, we apply them in the latent space of powerful pretrained autoencoders. In contrast to previous work, training diffusion models on such a representation allows for the first time to reach a near-optimal point between complexity reduction and detail preservation, greatly boosting visual fidelity. By introducing cross-attention layers into the model architecture, we turn diffusion models into powerful and flexible generators for general conditioning inputs such as text or bounding boxes and high-resolution synthesis becomes possible in a convolutional manner. Our latent diffusion models (LDMs) achieve a new state of the art for image inpainting and highly competitive performance on various tasks, including unconditional image generation, semantic scene synthesis, and super-resolution, while significantly reducing computational requirements compared to pixel-based DMs.