Deep Learning Weekly: Issue #249
Meta democratizes access to OPT-175B, Google's automated model-parallel deep learning, Graph Isomorphism Networks, and more
This week in deep learning, Meta democratizes access to OPT-175B, Google's automated model-parallel deep learning, Graph Isomorphism Networks, and a paper on evaluating gradient inversion attacks and defenses in federated learning.
You may also enjoy advances in GNN benchmarking, deploying ML pipelines with Terraform and AWS, an introduction to deep reinforcement learning, a paper on a next-generation knowledge construction and serving platform for powering knowledge-based applications at industrial scale, and more.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Democratizing access to large-scale language models with OPT-175B
Meta is sharing Open Pre-trained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available datasets, to allow for more community engagement in understanding this foundational new technology.
GraphWorld: Advances in Graph Benchmarking
Google introduces a methodology for analyzing the performance of GNN architectures on millions of synthetic benchmark datasets.
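GraphWorld populates its benchmark space by sampling synthetic graphs from parameterized generators such as the stochastic block model. As a rough illustration of that idea (parameter names are mine, not GraphWorld's API), a minimal SBM sampler in numpy:

```python
import numpy as np

def sample_sbm(block_sizes, p_in, p_out, rng=None):
    """Sample an undirected stochastic block model graph.

    Nodes in the same block connect with probability p_in,
    nodes in different blocks with probability p_out.
    Returns the adjacency matrix and per-node block labels.
    """
    rng = np.random.default_rng(rng)
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    n = labels.size
    # Edge probability for every node pair, based on block membership.
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample upper triangle only
    adj = (upper | upper.T).astype(int)               # symmetrize, no self-loops
    return adj, labels

adj, labels = sample_sbm([20, 20], p_in=0.5, p_out=0.05, rng=0)
```

Sweeping generator parameters like `p_in`/`p_out` is what lets a framework of this kind produce millions of benchmark datasets with controlled difficulty.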
Alpa: Automated Model-Parallel Deep Learning
Google describes a novel method for automating the complex process of parallelizing a model.
Intel debuts new chips for AI workloads, data center acceleration and laptops
Intel Corp. expands its product portfolio with several new chips, including an AI processor that it promises will provide twice the performance of NVIDIA’s flagship A100 graphics card.
5 Must-Do Error Analysis Before You Put Your Model in Production
This article dives into realistic scenarios, beyond surface-level accuracy metrics, that one should consider before putting machine learning (ML) models into production.
Kubernetes for AI: A Practical Guide
An article highlighting how Kubernetes provides value for machine learning projects and describing Kubeflow, a commonly used open-source tool for automating pipelines on Kubernetes.
Deploy and manage machine learning pipelines with Terraform using Amazon SageMaker
This post provides code and walks you through the steps necessary to deploy AWS infrastructure for ML pipelines with Terraform for model training and inference using Amazon SageMaker.
An End-to-End MLOps Platform Implementation using Open-source Tooling
An article on building an end-to-end MLOps platform with open-source tooling that can facilitate the operationalization of a wide range of machine learning use cases.
How Should you Protect your Machine Learning Models and IP?
Pete Warden’s take on how to protect your machine learning models by treating your training data like traditional source code and your model files like compiled executables.
Beyond Message Passing: a Physics-Inspired Paradigm for Graph Neural Networks
A theoretical blog post on the limits of current Graph Neural Network implementations and how these networks can move beyond message passing.
Generating Synthetic Data with Transformers: A Solution for Enterprise Data Challenges
This article explores how synthetic data can be artificially produced with transformer models, using NVIDIA NeMo as an example.
An Introduction to Deep Reinforcement Learning
This article is the first unit of the Deep Reinforcement Learning Class, a free course that takes you from beginner to expert, in which you’ll learn the theory and practice of deep RL using popular libraries such as Stable Baselines3, RL Baselines3 Zoo, and RLlib.
GIN: How to Design the Most Powerful Graph Neural Network
A technical blog post that introduces Graph Isomorphism Networks (GINs), details their advantages in discriminative power over GCN and GraphSAGE, and provides code for an implementation.
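The GIN layer from Xu et al. updates each node as h_v' = MLP((1 + ε)·h_v + Σ over neighbors u of h_u); the sum aggregation is what makes it as discriminative as the Weisfeiler-Lehman test. A minimal numpy sketch of one such layer (the two-layer ReLU MLP and toy graph are my own choices, not the blog's code):

```python
import numpy as np

def gin_layer(adj, h, W1, b1, W2, b2, eps=0.0):
    """One Graph Isomorphism Network layer:
    h_v' = MLP((1 + eps) * h_v + sum of neighbor features),
    with a two-layer ReLU MLP as the update function."""
    agg = (1.0 + eps) * h + adj @ h          # injective sum aggregation
    hidden = np.maximum(agg @ W1 + b1, 0.0)  # ReLU
    return hidden @ W2 + b2

# Toy 3-node path graph: 0 - 1 - 2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h = np.eye(3)  # one-hot node features
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((3, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 4)), np.zeros(4)
out = gin_layer(adj, h, W1, b1, W2, b2)
```

With identical input features, structurally equivalent nodes (here the two endpoints of the path) receive identical embeddings, while the sum aggregation still distinguishes neighborhoods of different sizes, which mean aggregation (as in GCN) cannot.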
Libraries & Code
A package that provides a simple interface for fitting and using state-of-the-art interpretable models, all compatible with scikit-learn.
An extensible, open-source MLOps framework to create production-ready machine learning pipelines.
Papers & Publications
Neural 3D Scene Reconstruction with the Manhattan-world Assumption
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images. Many previous works have shown impressive reconstruction results on textured objects, but they still have difficulty in handling low-textured planar regions, which are common in indoor scenes. An approach to solving this issue is to incorporate planar constraints into the depth map estimation in multi-view stereo-based methods, but the per-view plane estimation and depth optimization lack both efficiency and multi-view consistency. In this work, we show that the planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods. Specifically, we use an MLP network to represent the signed distance function as the scene geometry. Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network. To resolve the inaccurate segmentation, we encode the semantics of 3D points with another MLP and design a novel loss that jointly optimizes the scene geometry and semantics in 3D space. Experiments on ScanNet and 7-Scenes datasets show that the proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
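The Manhattan-world assumption says floor surfaces have normals parallel to the up axis and walls have normals orthogonal to it. A toy numpy sketch of a regularizer in that spirit (this is my own simplified illustration, not the paper's actual loss, and the z-up convention is an assumption):

```python
import numpy as np

def manhattan_loss(normals, labels, up=np.array([0.0, 0.0, 1.0])):
    """Toy Manhattan-world regularizer on predicted surface normals.

    labels: 0 = floor, 1 = wall, 2 = other (unconstrained).
    Floor normals should be parallel to `up`; wall normals orthogonal to it.
    """
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos = normals @ up                              # cosine to the up axis
    floor_term = 1.0 - np.abs(cos[labels == 0])     # want |cos| = 1 on floors
    wall_term = np.abs(cos[labels == 1])            # want cos = 0 on walls
    return floor_term.sum() + wall_term.sum()

# Perfectly Manhattan-aligned floor and wall normals give zero loss.
normals = np.array([[0.0, 0.0, 1.0],   # floor, points straight up
                    [1.0, 0.0, 0.0],   # wall, horizontal
                    [0.6, 0.0, 0.8]])  # unconstrained region
labels = np.array([0, 1, 2])
loss = manhattan_loss(normals, labels)
```

In the paper, terms of this flavor are applied to normals derived from the signed distance MLP, gated by the semantic labels that are jointly optimized in 3D.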
Evaluating Gradient Inversion Attacks and Defenses in Federated Learning
Gradient inversion attack (or input recovery from gradients) is an emerging threat to the security and privacy preservation of federated learning, whereby malicious eavesdroppers or participants in the protocol can partially recover clients' private data. This paper evaluates existing attacks and defenses. We find that some attacks make strong assumptions about the setup; relaxing such assumptions can substantially weaken these attacks. We then evaluate the benefits of three proposed defense mechanisms against gradient inversion attacks. We show the trade-offs between privacy leakage and data utility of these defense methods, and find that combining them in an appropriate manner makes the attack less effective, even under the original strong assumptions. We also estimate the computation cost of end-to-end recovery of a single image under each evaluated defense. Our findings suggest that the state-of-the-art attacks can currently be defended against with minor data utility loss, as summarized in a list of potential strategies.
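To see why shared gradients leak inputs at all, consider the classic special case of a fully connected layer y = Wx + b: since dL/dW = δ·xᵀ and dL/db = δ (where δ = dL/dy), any row of the weight gradient with a nonzero bias gradient reveals x exactly. A minimal numpy sketch of this analytic recovery (the general attacks evaluated in the paper instead optimize a dummy input to match the observed gradients):

```python
import numpy as np

def recover_input(grad_W, grad_b):
    """Recover the input of a fully connected layer y = W x + b
    from its gradients: dL/dW = outer(delta, x) and dL/db = delta,
    so any row i with delta_i != 0 gives x = grad_W[i] / grad_b[i]."""
    i = np.argmax(np.abs(grad_b))   # pick a row with nonzero delta
    return grad_W[i] / grad_b[i]

# Simulate the gradients a federated client would share for one example.
rng = np.random.default_rng(0)
x = rng.standard_normal(5)          # "private" client input
delta = rng.standard_normal(3)      # upstream gradient dL/dy
grad_W = np.outer(delta, x)         # dL/dW
grad_b = delta                      # dL/db
x_hat = recover_input(grad_W, grad_b)
```

This exact recovery breaks down once gradients are averaged over large batches, perturbed with noise, or pruned, which is precisely the kind of defense trade-off the paper quantifies.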
Saga: A Platform for Continuous Construction and Serving of Knowledge At Scale
We introduce Saga, a next-generation knowledge construction and serving platform for powering knowledge-based applications at industrial scale. Saga follows a hybrid batch-incremental design to continuously integrate billions of facts about real-world entities and construct a central knowledge graph that supports multiple production use cases with diverse requirements around data freshness, accuracy, and availability. In this paper, we discuss the unique challenges associated with knowledge graph construction at industrial scale, and review the main components of Saga and how they address these challenges. Finally, we share lessons learned from a wide array of production use cases powered by Saga.