Deep Learning Weekly: Issue #249
Meta democratizes access to OPT-175B, Google's automated model-parallel deep learning, Graph Isomorphism Networks, and more
This week in deep learning, Meta democratizes access to OPT-175B, Google's automated model-parallel deep learning, Graph Isomorphism Networks, and a paper on evaluating gradient inversion attacks and defenses in federated learning.
You may also enjoy advances in GNN benchmarking, deploying ML pipelines with Terraform and AWS, an introduction to deep reinforcement learning, a paper on a next-generation knowledge construction and serving platform for powering knowledge-based applications at industrial scale, and more.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Meta is sharing Open Pre-trained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available datasets, to allow for more community engagement in understanding this foundational new technology.
Google introduces a methodology for analyzing the performance of GNN architectures on millions of synthetic benchmark datasets.
Google describes a novel method for automating the complex process of parallelizing a model.
Intel Corp. expands its product portfolio with several new chips, including an AI processor that it promises will provide twice the performance of NVIDIA’s flagship A100 graphics card.
This article examines realistic scenarios, beyond headline accuracy metrics, that one should consider before putting machine learning (ML) models into production.
An article highlighting how Kubernetes is providing value for machine learning projects, and describing Kubeflow — a commonly used open source tool that can be used to automate pipelines on Kubernetes.
This post provides code and walks you through the steps necessary to deploy AWS infrastructure for ML pipelines with Terraform for model training and inference using Amazon SageMaker.
An article on building an end-to-end MLOps platform with open-source tooling that can help operationalize a wide range of machine learning use cases.
Pete Warden’s take on how to protect your machine learning models by treating your training data like you do your traditional source code and treating your model files like compiled executables.
A theoretical blog on the limits of current Graph Neural Network implementations and how they can go beyond message passing.
This article explores how synthetic data can be artificially produced with transformer models, using NVIDIA NeMo as an example.
This article is the first unit of Deep Reinforcement Learning Class, a free class from beginner to expert where you’ll learn the theory and practice using famous Deep RL libraries such as Stable Baselines3, RL Baselines3 Zoo, and RLlib.
A technical blog that introduces Graph Isomorphism Networks, details their advantages in discriminative power over a GCN or GraphSAGE, and provides code for implementing them.
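To make the discriminative-power point concrete, here is a minimal sketch (not taken from the linked blog) of why a sum-based, GIN-style aggregator can separate neighborhoods that mean or max aggregators collapse. The function names and toy features are hypothetical, and the MLP that GIN applies after aggregation is omitted for brevity.

```python
# Toy comparison of neighborhood aggregators on scalar node features.
# All names here are illustrative; the learned MLP that follows the
# GIN aggregation is left out to keep the sketch short.

def mean_aggregate(neighbor_feats):
    """Mean pooling, similar in spirit to normalized GCN / mean GraphSAGE."""
    return sum(neighbor_feats) / len(neighbor_feats)

def max_aggregate(neighbor_feats):
    """Max pooling, as used by one GraphSAGE variant."""
    return max(neighbor_feats)

def gin_aggregate(self_feat, neighbor_feats, eps=0.0):
    """GIN-style update before the MLP: (1 + eps) * h_v + sum of neighbors."""
    return (1.0 + eps) * self_feat + sum(neighbor_feats)

# Two different neighborhoods: same feature values, different multiplicity.
neighbors_a = [1.0, 1.0]
neighbors_b = [1.0, 1.0, 1.0]

print("mean:", mean_aggregate(neighbors_a), mean_aggregate(neighbors_b))
print("max: ", max_aggregate(neighbors_a), max_aggregate(neighbors_b))
print("gin: ", gin_aggregate(1.0, neighbors_a), gin_aggregate(1.0, neighbors_b))
```

Mean and max return the same value for both neighborhoods, while the sum keeps track of how many neighbors contributed, so the two nodes receive different representations.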
Libraries & Code
A package that provides a simple interface for fitting and using state-of-the-art interpretable models, all compatible with scikit-learn.
An extensible, open-source MLOps framework to create production-ready machine learning pipelines.
Papers & Publications
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images. Many previous works have shown impressive reconstruction results on textured objects, but they still have difficulty in handling low-textured planar regions, which are common in indoor scenes. An approach to solving this issue is to incorporate planar constraints into the depth map estimation in multi-view stereo-based methods, but the per-view plane estimation and depth optimization lack both efficiency and multi-view consistency. In this work, we show that the planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods. Specifically, we use an MLP network to represent the signed distance function as the scene geometry. Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network. To resolve the inaccurate segmentation, we encode the semantics of 3D points with another MLP and design a novel loss that jointly optimizes the scene geometry and semantics in 3D space. Experiments on ScanNet and 7-Scenes datasets show that the proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
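As a rough illustration of what Manhattan-world planar constraints on surface normals could look like, the sketch below penalizes floor normals that deviate from an up direction and wall normals that are neither parallel nor perpendicular to a reference wall direction. This is an assumption about the general form of such a regularizer, not the paper's exact loss; the function names, the fixed up vector, and the hand-picked unit normals are all hypothetical. In a real pipeline the normals would come from the gradient of the learned signed distance function.

```python
# Illustrative Manhattan-world normal penalties on plain 3-vectors.
# Assumes all inputs are unit vectors; names and defaults are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def floor_penalty(normal, up=(0.0, 0.0, 1.0)):
    """Zero when a floor point's normal points straight up."""
    return abs(1.0 - dot(normal, up))

def wall_penalty(normal, wall_dir):
    """Zero when a wall normal is parallel, anti-parallel, or
    perpendicular to the reference horizontal direction wall_dir."""
    cosine = dot(normal, wall_dir)
    return min(abs(target - cosine) for target in (-1.0, 0.0, 1.0))

reference_wall = (1.0, 0.0, 0.0)
diagonal = (2 ** -0.5, 2 ** -0.5, 0.0)  # 45 degrees off the reference

print(floor_penalty((0.0, 0.0, 1.0)))                # perfectly flat floor
print(wall_penalty((0.0, 1.0, 0.0), reference_wall)) # perpendicular wall
print(wall_penalty(diagonal, reference_wall))        # violates the assumption
```

The first two calls incur no penalty, while the diagonal wall does, which is the kind of signal a regularizer can use to flatten low-textured regions.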
Gradient inversion attack (or input recovery from gradient) is an emerging threat to the security and privacy preservation of federated learning, whereby malicious eavesdroppers or participants in the protocol can recover (partially) the clients' private data. This paper evaluates existing attacks and defenses. We find that some attacks make strong assumptions about the setup. Relaxing such assumptions can substantially weaken these attacks. We then evaluate the benefits of three proposed defense mechanisms against gradient inversion attacks. We show the trade-offs of privacy leakage and data utility of these defense methods, and find that combining them in an appropriate manner makes the attack less effective, even under the original strong assumptions. We also estimate the computation cost of end-to-end recovery of a single image under each evaluated defense. Our findings suggest that the state-of-the-art attacks can currently be defended against with minor data utility loss, as summarized in a list of potential strategies.
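For intuition about why shared gradients can leak inputs at all, here is a minimal sketch of the simplest analytic case: a single example passed through one fully connected layer with a bias. This is not one of the attacks evaluated in the paper (those are typically optimization-based and target deep networks); the squared-error loss, toy weights, and function names are assumptions chosen for illustration. Since the weight gradient is the outer product of the output error and the input, and the bias gradient is the output error itself, dividing one by the other returns the input.

```python
# Analytic input recovery from the gradients of y = W x + b for one sample.
# Loss is assumed to be 0.5 * ||y - target||^2; all values are toy data.

def linear_forward(W, b, x):
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def gradients(W, b, x, target):
    """Return (dL/dW, dL/db) for the squared-error loss on one example."""
    y = linear_forward(W, b, x)
    delta = [yi - ti for yi, ti in zip(y, target)]   # dL/dy
    grad_W = [[d * xj for xj in x] for d in delta]   # outer(delta, x)
    grad_b = delta                                   # dL/db equals dL/dy
    return grad_W, grad_b

def recover_input(grad_W, grad_b, tol=1e-12):
    """Divide a row of dL/dW by the matching nonzero entry of dL/db."""
    for row, d in zip(grad_W, grad_b):
        if abs(d) > tol:
            return [g / d for g in row]
    raise ValueError("all bias gradients are zero; input not recoverable this way")

W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
b = [0.1, -0.2]
private_x = [3.0, -2.0, 0.5]   # the client's private input
target = [1.0, 0.0]

grad_W, grad_b = gradients(W, b, private_x, target)  # what an eavesdropper sees
recovered = recover_input(grad_W, grad_b)
print(recovered)
```

The recovery works here only because the gradient comes from a single example; averaging over a batch mixes the outer products, which is one reason the attack assumptions discussed in the abstract matter so much.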
We introduce Saga, a next-generation knowledge construction and serving platform for powering knowledge-based applications at industrial scale. Saga follows a hybrid batch-incremental design to continuously integrate billions of facts about real-world entities and construct a central knowledge graph that supports multiple production use cases with diverse requirements around data freshness, accuracy, and availability. In this paper, we discuss the unique challenges associated with knowledge graph construction at industrial scale, and review the main components of Saga and how they address these challenges. Finally, we share lessons learned from a wide array of production use cases powered by Saga.