Deep Learning Weekly: Issue #279
Diffusion models for protein design, predicting Los Angeles traffic with Graph Neural Networks, building a virtual machine inside ChatGPT, a paper on Score Jacobian Chaining, and many more
Hey folks,
This week in deep learning, we bring you Diffusion Models used by biotech labs for protein design, predicting Los Angeles Traffic with Graph Neural Networks, building a virtual machine inside ChatGPT, and a paper on Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation.
You may also enjoy PyTorch 2.0, six ways to optimize models for deployment and inference, comparing human perception and multimodal LLMs, a paper on Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Biotech labs are using AI inspired by DALL-E to invent new drugs
Generate Biomedicines, a Boston-based startup, and the University of Washington announced programs that use diffusion models to generate designs for novel proteins with more precision than ever before.
Cerebras Systems unveils Andromeda, a 13.5 million core AI supercomputer
Cerebras Systems, the pioneer in accelerating AI compute, unveiled Andromeda, a 13.5 million core AI supercomputer, now available and being used for commercial and academic work.
PyTorch 2.0 release accelerates open-source machine learning
The first experimental release of PyTorch 2.0 is out, which promises accelerated training and development through torch.compile and other advancements.
Introducing Cohere Sandbox: Open-Source Libraries to Help Developers Experiment with Language AI
Cohere shares a collection of experimental, open-source GitHub repositories that make building applications using large language models fast and easy.
Intel Labs and Penn Medicine study uses federated learning to increase brain tumor detection
Intel Labs and the Perelman School of Medicine at the University of Pennsylvania released a joint research study that used federated learning to help healthcare institutions identify malignant brain tumors.
MLOps
Building Airbnb Categories with ML and Human-in-the-Loop
A high-level introductory post about how Airbnb applied machine learning to build out its listing collections and to solve different tasks related to the browsing experience: specifically, quality estimation, photo selection, and ranking.
Optimizing Models for Deployment and Inference
This article dives into six ways you can manage and optimize your models for deployment and inference.
Monitoring ML Models with FastAPI and Evidently AI
A technical blog that demonstrates how to monitor models with FastAPI and Evidently AI.
Learning
Predicting Los Angeles Traffic with Graph Neural Networks
This post explores the use of GNNs for traffic forecasting, focusing on the ST-GAT model developed by Zhang et al. in “Spatial-Temporal Graph Attention Networks: A Deep Learning Approach for Traffic Forecasting.”
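At the heart of the ST-GAT's spatial layer is graph attention: each sensor node aggregates its neighbors' features with softmax-normalized attention weights. A minimal plain-Python sketch of one such aggregation step (real implementations use a library like PyTorch Geometric; the data layout and function name here are illustrative, and the attention logits are assumed precomputed):

```python
# One graph-attention aggregation step, sketched with scalar node features.
import math

def gat_step(features, neighbors, scores):
    """features: node_id -> feature (float);
    neighbors: node_id -> list of neighbor node_ids;
    scores: (i, j) -> unnormalized attention logit for edge i<-j."""
    out = {}
    for i, nbrs in neighbors.items():
        logits = [scores[(i, j)] for j in nbrs]
        m = max(logits)                           # subtract max for numerical stability
        exp = [math.exp(l - m) for l in logits]
        z = sum(exp)
        alpha = [e / z for e in exp]              # attention weights sum to 1
        out[i] = sum(a * features[j] for a, j in zip(alpha, nbrs))
    return out
```

With equal logits this reduces to a plain neighborhood average; learned logits let each node weight informative neighbors (e.g., upstream road sensors) more heavily.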
A guide on how to use Comet’s new open-source data exploration tool, Kangas.
Building A Virtual Machine inside ChatGPT
An interesting guide on how to build a virtual machine inside ChatGPT.
YOLOv7: A deep dive into the current state-of-the-art for object detection
A comprehensive tutorial that covers how to train YOLOv7 models in custom training scripts, explores data augmentation techniques, highlights how to select and modify anchor boxes, and demystifies how the loss function works.
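Both anchor selection and the YOLO loss lean on intersection-over-union (IoU) between boxes. A minimal stdlib sketch of that computation (the `(x1, y1, x2, y2)` box format and function name are illustrative):

```python
# Minimal IoU sketch for axis-aligned boxes (x1, y1, x2, y2), as used when
# matching predictions and anchors to ground-truth boxes in YOLO-style detectors.
def iou(box_a, box_b):
    # intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes overlapping in a 1x1 region give IoU 1/7.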
Learning to Make the Right Mistakes
An article that briefly compares human perception and multimodal LLMs.
An in-depth tutorial that covers how to build a multi-objective recommender for Movielens, using both implicit (movie watches) and explicit (ratings) signals.
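A multi-objective recommender of this kind typically optimizes a weighted sum of an implicit-feedback loss and an explicit-rating loss. A minimal stdlib sketch of such a combined objective (the weighting scheme and all names are illustrative, not the tutorial's exact formulation):

```python
# Combined loss sketch: binary cross-entropy on watches (implicit) plus
# squared error on ratings (explicit), traded off by alpha.
import math

def multi_objective_loss(watch_prob, watched, pred_rating, rating, alpha=0.5):
    eps = 1e-12  # numerical guard for log(0)
    bce = -(watched * math.log(watch_prob + eps)
            + (1 - watched) * math.log(1 - watch_prob + eps))
    mse = (pred_rating - rating) ** 2
    return alpha * bce + (1 - alpha) * mse
```

Tuning `alpha` shifts the model between predicting what users will watch and how they will rate it.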
Libraries & Code
The Language Interpretability Tool: Interactively analyze NLP models for model understanding in an extensible and framework agnostic interface.
This version of Stable Diffusion features a web interface, an interactive command-line script that combines text2img and img2img functionality, and other enhancements.
Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.
Build, train, and fine-tune production-ready deep learning SOTA vision models.
Papers & Publications
Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
Abstract:
A diffusion model learns to predict a vector field of gradients. We propose to apply chain rule on the learned gradients, and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate to be a voxel radiance field. This setup aggregates 2D scores at multiple camera viewpoints into a 3D score, and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application, and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION dataset.
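In symbols, the chaining described above can be sketched as follows (notation is illustrative, following the abstract: 2D scores at several camera poses are composed with the renderer's Jacobian to yield a gradient on the 3D parameters):

```latex
% theta: parameters of the 3D representation (a voxel radiance field);
% x_pi(theta): differentiable render at camera pose pi; s(.): learned 2D score.
\nabla_{\theta} \log p(\theta)
  \approx \sum_{\pi}
  \underbrace{s\!\left(x_{\pi}(\theta)\right)}_{\text{2D score}}
  \cdot
  \underbrace{\frac{\partial x_{\pi}(\theta)}{\partial \theta}}_{\text{renderer Jacobian}}
```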
Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
Abstract:
Most existing Image Restoration (IR) models are task-specific, and cannot be generalized to different degradation operators. In this work, we propose the Denoising Diffusion Null-Space Model (DDNM), a novel zero-shot framework for arbitrary linear IR problems, including but not limited to image super-resolution, colorization, inpainting, compressed sensing, and deblurring. DDNM only needs a pre-trained off-the-shelf diffusion model as the generative prior, without any extra training or network modifications. By refining only the null-space contents during the reverse diffusion process, we can yield diverse results satisfying both data consistency and realness. We further propose an enhanced and robust version, dubbed DDNM+, to support noisy restoration and improve restoration quality for hard tasks. Our experiments on several IR tasks reveal that DDNM outperforms other state-of-the-art zero-shot IR methods. We also demonstrate that DDNM+ can solve complex real-world applications, e.g., old photo restoration.
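The null-space idea can be sketched with the range-space/null-space decomposition for a linear degradation y = Ax (notation illustrative, following the abstract):

```latex
% For a linear operator A with pseudo-inverse A^dagger, any estimate \bar{x}
% can be corrected so that data consistency A \hat{x} = y holds exactly:
\hat{x} \;=\;
  \underbrace{A^{\dagger} y}_{\text{range-space, fixed by } y}
  \;+\;
  \underbrace{\left(I - A^{\dagger} A\right) \bar{x}}_{\text{null-space, refined by diffusion}}
% since A A^{\dagger} y = y for y in the range of A, and A (I - A^{\dagger} A) = 0.
```

Only the null-space term is free, so the reverse diffusion process can refine realism without ever violating the measurement constraint.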
DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection
Abstract:
Visual anomaly detection, an important problem in computer vision, is usually formulated as a one-class classification and segmentation task. The student-teacher (S-T) framework has proved to be effective in solving this challenge. However, previous works based on S-T only empirically applied constraints on normal data and fused multi-level information. In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework. First, to strengthen the constraints on anomalous data, we introduce a denoising procedure that allows the student network to learn more robust representations. From synthetically corrupted normal images, we train the student network to match the teacher network feature of the same images without corruption. Second, to fuse the multi-level S-T features adaptively, we train a segmentation network with rich supervision from synthetic anomaly masks, achieving a substantial performance improvement. Experiments on the industrial inspection benchmark dataset demonstrate that our method achieves state-of-the-art performance, 98.6% on image-level ROC, 75.8% on pixel-level average precision, and 76.4% on instance-level average precision.