Deep Learning Weekly: Issue #240
Google AI's new training strategy for action recognition, MLOps State of Affairs, a deep dive into GAN Failure Modes and more.
This week in deep learning, we bring you Google AI's new training strategy for action recognition, MLOps State of Affairs, a deep dive into GAN Failure Modes, and a paper on FastFold, a highly efficient implementation of a protein structure prediction model.
You may also enjoy MIT's fairness technique called Partial Attribute Decorrelation, a code generation model that writes better C than Codex, PyTorch's LazyTensor System Performance, a paper on a high-resolution, 3D-consistent image and shape generation technique, and more.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Injecting fairness into machine-learning models
MIT researchers develop a new technique, called Partial Attribute Decorrelation (PARADE), that reduces bias in machine-learning models, even when the dataset used to train them is unbalanced.
Co-training Transformer with Videos and Images Improves Action Recognition
Google AI proposes a training strategy, named CoVeR, that leverages both image and video data to jointly learn a single general-purpose action recognition model.
The benefits of peripheral vision for machines
MIT researchers find similarities between how some computer vision systems process images and how humans see out of the corners of their eyes.
Graphcore debuts 3D AI chip with new wafer-on-wafer technology
Chip startup Graphcore Ltd. introduced a new artificial intelligence processor, the Bow IPU, that uses an innovation dubbed wafer-on-wafer technology to speed up calculations.
PolyCoder is an open source AI code-generator that researchers claim trumps Codex
Researchers at Carnegie Mellon University — Frank Xu, Uri Alon, Graham Neubig, and Vincent Hellendoorn — developed PolyCoder, a model based on OpenAI’s GPT-2 language model that was trained on a 249GB dataset of code spanning 12 programming languages.
MLOps is a Mess but that is to be Expected
A comprehensive discussion of the state of machine learning operations (MLOps) today: where we are and where we are going.
Bootstrapping Labels via Supervision & Human-In-The-Loop
A write-up that discusses semi-supervised, active, and weakly supervised learning, along with human-in-the-loop examples from DoorDash, Facebook, Google, and Apple.
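To give a flavor of one of these strategies, here is a minimal, framework-free sketch of uncertainty sampling for active learning. The scoring model and all names below are hypothetical illustrations, not taken from the write-up:

```python
# Toy active-learning loop using uncertainty sampling: items whose predicted
# probability is closest to 0.5 are the ones the model is least sure about,
# so those get routed to human annotators first.

def predict_proba(x):
    # Stand-in for a real classifier's probability output;
    # here each "item" is already a probability in [0, 1].
    return x

def select_for_labeling(unlabeled, budget):
    # Rank by uncertainty: |p - 0.5| is smallest when the model is unsure.
    ranked = sorted(unlabeled, key=lambda x: abs(predict_proba(x) - 0.5))
    return ranked[:budget]

pool = [0.95, 0.51, 0.10, 0.48, 0.70]
print(select_for_labeling(pool, 2))  # the two most uncertain items
```

In a real pipeline the labeled picks would be fed back into training and the loop repeated until the labeling budget is spent.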
Amazon Ads Uses PyTorch and AWS Inferentia to Scale Models for Ads Processing
A case study on how Amazon Ads uses PyTorch, TorchServe, and AWS Inferentia to reduce inference costs by 71% and scale out its ad-processing models.
How to Build a Customized Panel in Comet
A tutorial on how to build a custom Panel in Comet using the provided dashboard and SDK.
Deploy AI Workloads at Scale with Bottlerocket and NVIDIA-Powered Amazon EC2 Instances
AWS and NVIDIA have collaborated to enable Bottlerocket, a minimal, container-optimized Linux OS for AI workloads, to support all NVIDIA-powered Amazon EC2 instances, including P4d, P3, G4dn, and G5.
Understanding LazyTensor System Performance with PyTorch/XLA on Cloud TPU
A post exploring the basic concepts of the LazyTensor system, with the goal of applying these concepts to understand and debug the performance of LazyTensor-based implementations in PyTorch.
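As a loose analogy for the lazy-evaluation idea the post explores (a toy sketch, not the PyTorch/XLA API): operations record a trace instead of executing immediately, and computation happens only at a barrier, which is what lets a backend fuse and optimize the recorded graph:

```python
# Toy lazy "tensor": ops build a deferred expression instead of computing,
# and everything runs in one batch when .materialize() is called -- loosely
# analogous to LazyTensor tracing ops and compiling them at a step barrier.

class LazyValue:
    def __init__(self, compute, trace):
        self._compute = compute   # thunk that produces the real value
        self.trace = trace        # recorded ops, inspectable for debugging

    @staticmethod
    def constant(x):
        return LazyValue(lambda: x, [f"const({x})"])

    def add(self, other):
        return LazyValue(lambda: self._compute() + other._compute(),
                         self.trace + other.trace + ["add"])

    def mul(self, other):
        return LazyValue(lambda: self._compute() * other._compute(),
                         self.trace + other.trace + ["mul"])

    def materialize(self):
        # The "barrier": only here does computation actually happen.
        return self._compute()

a = LazyValue.constant(2)
b = LazyValue.constant(3)
c = a.add(b).mul(a)          # nothing computed yet, just a recorded trace
print(c.trace)               # ['const(2)', 'const(3)', 'add', 'const(2)', 'mul']
print(c.materialize())       # 10
```

The performance pitfalls the post discusses arise when user code forces frequent materialization, shrinking the traces the compiler can optimize.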
GAN Failure Modes: How to Identify and Monitor Them
In this article, we’ll see how to train a stable GAN model and then play around with the training process to understand the possible causes of these failure modes.
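One classic failure mode is mode collapse, where the generator covers only a narrow slice of the data distribution. A simple, framework-free way to watch for it is to track the diversity of generated samples against that of real data; the diversity metric and threshold below are arbitrary choices for this sketch, not the article's method:

```python
# Toy mode-collapse monitor: compare the spread of generated samples to the
# spread of real data. If generated samples cluster far more tightly, flag a
# possible collapse. The 0.1 ratio threshold is an arbitrary illustration.
from statistics import pstdev

def check_collapse(generated, real, ratio_threshold=0.1):
    # Flag when generated diversity drops far below real-data diversity.
    return pstdev(generated) < ratio_threshold * pstdev(real)

real = [0.1, 0.9, 0.5, 0.3, 0.7]            # spread-out "real" data
healthy = [0.2, 0.8, 0.4, 0.6, 0.5]         # generator covering the range
collapsed = [0.50, 0.51, 0.50, 0.49, 0.50]  # generator stuck on one mode

print(check_collapse(healthy, real))    # False
print(check_collapse(collapsed, real))  # True
```

In practice this kind of check would run periodically during training, alongside loss curves and sample grids, rather than as a one-off test.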
What's a Recommender System? NVIDIA's Even Oldridge Explains
To dig into how recommender systems work — and why these systems are being harnessed by companies in industries around the globe — NVIDIA AI Podcast host Noah Kravitz spoke to Even Oldridge, senior manager for the Merlin team at NVIDIA.
Lessons Learned on Language Model Safety and Misuse
OpenAI describes their latest thinking in the hope of helping other AI developers address safety and misuse of deployed models.
Libraries & Code
Tencent/ncnn: High-performance neural network inference framework
ncnn is a high-performance neural network inference computing framework optimized for mobile platforms.
jina-ai/jina: Cloud-native neural search framework
Jina is a neural search framework that empowers anyone to build state-of-the-art, scalable neural search applications in minutes.
Papers & Publications
FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
Protein structure prediction is an important method for understanding gene translation and protein function in the domain of structural biology. AlphaFold introduced the Transformer model to the field of protein structure prediction with atomic accuracy. However, training and inference of the AlphaFold model are time-consuming and expensive because of its special performance characteristics and huge memory consumption. In this paper, we propose FastFold, a highly efficient implementation of the protein structure prediction model for training and inference. FastFold includes a series of GPU optimizations based on a thorough analysis of AlphaFold's performance. Meanwhile, with Dynamic Axial Parallelism and Duality Async Operation, FastFold achieves high model parallelism scaling efficiency, surpassing existing popular model parallelism techniques. Experimental results show that FastFold reduces overall training time from 11 days to 67 hours and achieves a 7.5-9.5X speedup for long-sequence inference. Furthermore, we scaled FastFold to 512 GPUs and achieved an aggregate of 6.02 PetaFLOPs with 90.1% parallel efficiency.
StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation
We introduce a high-resolution, 3D-consistent image and shape generation technique which we call StyleSDF. Our method is trained on single-view RGB data only, and stands on the shoulders of StyleGAN2 for image generation, while solving two main challenges in 3D-aware GANs: 1) high-resolution, view-consistent generation of RGB images, and 2) detailed 3D shape. We achieve this by merging an SDF-based 3D representation with a style-based 2D generator. Our 3D implicit network renders low-resolution feature maps, from which the style-based network generates view-consistent, 1024x1024 images. Notably, our SDF-based 3D modeling defines detailed 3D surfaces, leading to consistent volume rendering. Our method shows higher quality results compared to the state of the art in terms of visual and geometric quality.