Deep Learning Weekly: Issue #282
2023 Predictions from NVIDIA AI Experts, ML Observability and its investment potential, Differential Privacy Series by PyTorch, a paper on reasoning over different types of knowledge graphs, and more
Hey Folks,
Happy New Year! This week in deep learning, we bring you 2023 Predictions from NVIDIA AI Experts, ML Observability and its investment potential, Differential Privacy Series by PyTorch, and a paper on reasoning over different types of knowledge graphs: static, temporal, and multi-modal.
You may also enjoy major updates to NVIDIA's Isaac Sim, a TorchServe sketch-to-animation case study, modeling recommendation systems as reinforcement learning problems, a paper on generalized decoding for pixel, image, and language, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
2023 Predictions: AI That Bends Reality, Unwinds the Golden Screw and Self-Replicates
15 NVIDIA AI experts predict digital twins and generative AI are set to advance enterprise goals and consumer needs even as the world enters a third year of planning uncertainty.
New State-of-the-Art Quantized Models Added in TF Model Garden
TensorFlow expands the coverage of QAT support and introduces new state-of-the-art quantized models in Model Garden for object detection, semantic segmentation, and natural language processing.
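Quantization-aware training (QAT) works by inserting fake-quantization ops during training, so the model learns weights that survive low-precision inference. As a minimal illustration of the fake-quantize round-trip (plain Python for clarity; this is not the TF Model Garden or TensorFlow Model Optimization API):

```python
def fake_quantize(x, num_bits=8, x_min=-1.0, x_max=1.0):
    """Simulate quantization during training: snap a float to an integer
    grid, then dequantize back to float so training can continue."""
    levels = 2 ** num_bits - 1
    scale = (x_max - x_min) / levels
    x_clamped = min(max(x, x_min), x_max)
    q = round((x_clamped - x_min) / scale)  # integer code in [0, levels]
    return x_min + q * scale                # dequantized float

# The round-trip error is bounded by half a quantization step.
step = 2.0 / 255
assert abs(fake_quantize(0.3) - 0.3) <= step / 2
```

In a real QAT setup these ops are applied to weights and activations per layer, and the backward pass uses a straight-through estimator to pass gradients through the rounding.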
Machine learning salaries saw significant increase in 2022, says report
Machine learning analysts and site reliability engineers experienced a salary increase of almost one third (31%) in 2022.
Nvidia unveils advances in autonomous robotics simulation using AI
Nvidia announced major updates to Isaac Sim, its robotics simulation tool that lets designers and engineers build and test virtual robots in realistic environments such as warehouses, factory floors, and city streets, in order to train their AI and improve their designs.
Data science and analytics startup Tredence raises $175M to solve AI's last-mile problem
Data science and artificial intelligence solutions startup Tredence Inc. announced that it has closed a $175 million funding round led by Advent International.
MLOps
ML Observability — Hype or Here to Stay?
An article that delves into what ML Observability is and why it is needed. This article also covers the investment potential of observability tools and the challenges of an overload in tooling.
TorchServe Performance Tuning, Animated Drawings Case-Study
An overview of TorchServe and how to tune its performance for production use cases. The article discusses a sketch-to-animation app and how it served peak traffic with TorchServe.
Accelerating PyTorch Transformers with Intel Sapphire Rapids - part 1
In this post, you will learn how to accelerate a PyTorch training job with a cluster of Sapphire Rapids servers running on AWS.
A post that uses Amazon SageMaker Studio to showcase how to log metrics from a Studio notebook using the updated SageMaker Experiments functionality.
Learning
Modeling Recommendation Systems as Reinforcement Learning Problem
An article that covers how recommendation systems can be mathematically modeled as reinforcement learning problems.
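In this framing, each recommendation is an action, user feedback (e.g. a click) is the reward, and the system learns action values online. A minimal epsilon-greedy bandit recommender sketches the idea (a deliberate simplification of the full MDP formulation, with hypothetical item names):

```python
import random

class BanditRecommender:
    """Epsilon-greedy recommender: each item is an arm, a click is reward 1."""

    def __init__(self, items, epsilon=0.1, seed=0):
        self.items = list(items)
        self.epsilon = epsilon
        self.counts = {i: 0 for i in items}
        self.values = {i: 0.0 for i in items}  # running mean reward per item
        self.rng = random.Random(seed)

    def recommend(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.items)                 # explore
        return max(self.items, key=lambda i: self.values[i])   # exploit

    def update(self, item, reward):
        self.counts[item] += 1
        n = self.counts[item]
        self.values[item] += (reward - self.values[item]) / n  # incremental mean
```

A full RL treatment would also model user state and optimize long-term reward over a session; the bandit above ignores state, which is exactly the gap the article's MDP formulation addresses.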
Exploring Image Classification With Kangas
In this article, I’ll go into greater detail about what Kangas does, walk you through installing it, and then use Kangas to classify some images.
Differential Privacy Series by PyTorch
A three-part series of articles on DP-SGD (differentially private stochastic gradient descent), its underlying mathematics, and its practical applications using libraries such as Opacus.
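The core DP-SGD step is: clip each per-example gradient to an L2 norm bound C, sum, add Gaussian noise calibrated to C, and average. A minimal sketch of that step (plain Python for illustration; this is not the Opacus API):

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, seed=0):
    """Clip each per-example gradient to L2 norm <= clip_norm, sum them,
    add Gaussian noise with std = noise_multiplier * clip_norm, and average."""
    rng = random.Random(seed)
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0  # clip
        for j in range(dim):
            summed[j] += g[j] * scale
    n = len(per_example_grads)
    return [(summed[j] + rng.gauss(0.0, noise_multiplier * clip_norm)) / n
            for j in range(dim)]
```

Clipping bounds any one example's influence on the update, and the noise scale relative to that bound is what yields the formal (epsilon, delta) privacy guarantee tracked by an accountant in practice.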
Analyzing Machine Learning models on a layer-by-layer basis
This blog explains how to analyze a neural network on a layer-by-layer basis and builds on top of the blog post explaining how to use Vela.
Introduction to Graph Machine Learning
In this blog post, we cover the basics of graph machine learning.
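At the heart of most graph neural networks is message passing: each node updates its representation by aggregating its neighbors' features. A minimal one-round mean-aggregation sketch over an adjacency list (plain Python with a made-up toy graph, not any particular library's API):

```python
def message_pass(features, adjacency):
    """One round of mean aggregation: each node's new feature is the
    average of its own feature and its neighbors' features."""
    new = {}
    for node, neighbors in adjacency.items():
        vals = [features[node]] + [features[n] for n in neighbors]
        new[node] = sum(vals) / len(vals)
    return new

# Toy path graph: 0 - 1 - 2
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: 0.0, 1: 3.0, 2: 6.0}
# Node 1 averages its own 3.0 with neighbors 0.0 and 6.0 -> 3.0
```

Real GNN layers use vector features, learned transformations, and several such rounds, but this aggregate-then-update loop is the common skeleton.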
Libraries & Code
Monitor deep learning model training and hardware usage from your mobile phone.
A Python package for estimating heterogeneous treatment effects from observational data via machine learning.
This repository shares example code and example prompts for accomplishing common tasks with the OpenAI API.
Papers & Publications
Cramming: Training a Language Model on a Single GPU in One Day
Abstract:
Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day?
We investigate the downstream performance achievable with a transformer-based language model trained completely from scratch with masked language modeling for a single day on a single consumer GPU. Aside from re-analyzing nearly all components of the pretraining pipeline for this scenario and providing a modified pipeline with performance close to BERT, we investigate why scaling down is hard, and which modifications actually improve performance in this scenario. We provide evidence that even in this constrained setting, performance closely follows scaling laws observed in large-compute settings. Through the lens of scaling laws, we categorize a range of recent improvements to training and architecture and discuss their merit and practical applicability (or lack thereof) for the limited compute setting.
Generalized Decoding for Pixel, Image, and Language
Abstract:
We present X-Decoder, a generalized decoding model that can predict pixel-level segmentation and language tokens seamlessly. X-Decoder takes as input two types of queries: (i) generic non-semantic queries and (ii) semantic queries induced from text inputs, to decode different pixel-level and token-level outputs in the same semantic space. With such a novel design, X-Decoder is the first work that provides a unified way to support all types of image segmentation and a variety of vision-language (VL) tasks. Further, our design enables seamless interactions across tasks at different granularities and brings mutual benefits by learning a common and rich pixel-level visual-semantic understanding space, without any pseudo-labeling. After pretraining on a mixed set of a limited amount of segmentation data and millions of image-text pairs, X-Decoder exhibits strong transferability to a wide range of downstream tasks in both zero-shot and fine-tuning settings. Notably, it achieves (1) state-of-the-art results on open-vocabulary segmentation and referring segmentation on eight datasets; (2) better or competitive finetuned performance to other generalist and specialist models on segmentation and VL tasks; and (3) flexibility for efficient finetuning and novel task composition (e.g., referring captioning and image editing). Code, demo, video, and visualization are available at this https URL.
Reasoning over Different Types of Knowledge Graphs: Static, Temporal and Multi-Modal
Abstract:
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, the existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a survey of knowledge graph reasoning, tracing from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed in turn. Moreover, we discuss the challenges and potential opportunities.