Deep Learning Weekly : Issue #321
A New Family of Physics-inspired Generative Models, Microsoft's AutoGen, Problems of AI Consciousness, a paper on DreamGaussian, and many more!
This week in deep learning, we bring you A New Family of Physics-inspired Generative Models, Microsoft's AutoGen, Problems of AI Consciousness, and a paper on DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation.
You may also enjoy Stable LM 3B, Training Foundation Improvements for Closeup Recommendation Ranker, LoRA or Full-Parameter on Llama 2, a paper on Boolformer: Symbolic Regression of Logic Functions with Transformers, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
From physics to generative AI: An AI model for advanced pattern generation
Researchers introduced a new family of physics-inspired generative models, termed PFGM++, that unifies diffusion models and Poisson Flow Generative Models (PFGM) for advanced pattern generation.
Introducing Stable LM 3B: Bringing Sustainable, High-Performance Language Models to Smart Devices
Stability AI proudly launched an experimental version of Stable LM 3B, the latest in a suite of high-performance generative AI solutions.
Helios unveils AI analyst Cersi for tracking food supply chain disruptions
Helios introduced Cersi, a conversational AI chatbot, similar to ChatGPT or Claude 2, but specialized for the agricultural supply chain.
Asana leans on AI to boost productivity and help companies benchmark their employees' performance
Asana announced that it’s integrating AI into multiple aspects of its work management platform, taking advantage of its Work Graph architecture.
Stampli reels in $61M for its AI-powered accounting platform
Stampli, a startup using AI to make accounting teams more productive, announced that it has closed a $61 million late-stage funding round led by Blackstone.
MLOps & LLMOps
Training Foundation Improvements for Closeup Recommendation Ranker
A deeper look into Pinterest’s training foundations for their Closeup Recommendation system and the Auto-Retraining Framework (ARF) used to keep models up to date.
Prompt Engineering Evolution: Defining the New Program Simulation Prompt Framework
A prompt engineering guide that highlights a new technique called program simulation prompting.
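To give a flavor of the pattern, here is a minimal, illustrative sketch of a program-simulation-style prompt, assuming the core idea is instructing the model to behave like a stateful program with named functions. The prompt text below is our guess at the shape of the technique, not the article's actual template:

```python
# Illustrative program-simulation prompt (assumed pattern, not the article's exact framework).
PROGRAM_SIMULATION_PROMPT = """
You are to simulate a program with the following state and functions.

State:
  topics: a list of topics collected so far (initially empty)

Functions:
  add_topic(t): append t to topics and confirm.
  summarize(): return a one-paragraph summary of all collected topics.
  help(): list the available functions.

Rules:
  - On each user message, decide which function to execute and print its output.
  - Never break character; always respond as the running program.

Begin by executing help().
"""
```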
An article that describes the technical and architectural challenges of building a distributed system for synchronizing and ingesting billions of text embeddings for RAG applications.
Enhancing Customer Churn Prediction with Continuous Experiment Tracking
A deep dive into a machine learning project aimed at predicting customer churn and exploring Comet ML.
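As a flavor of what continuous experiment tracking looks like in code, here is a minimal sketch using Comet ML's Python SDK; the project name, model, and metrics are placeholders rather than the article's setup, and the data is synthetic:

```python
from comet_ml import Experiment  # import before sklearn/torch so Comet can auto-log
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Placeholder project name; Comet reads the API key from the COMET_API_KEY env var.
experiment = Experiment(project_name="customer-churn")

# Synthetic stand-in for a churn dataset (label 1 = churned).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

params = {"n_estimators": 200, "max_depth": 8}
experiment.log_parameters(params)

model = RandomForestClassifier(**params).fit(X_train, y_train)
preds = model.predict(X_test)

# Logging the same metrics on every run keeps experiments comparable in the Comet UI.
experiment.log_metric("accuracy", accuracy_score(y_test, preds))
experiment.log_metric("f1", f1_score(y_test, preds))
experiment.end()
```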
Retrieval Augmented Generation on audio data with LangChain and Chroma
An article on how to perform RAG on audio data using LangChain and Chroma.
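A minimal sketch of the pipeline, assuming the audio has already been transcribed to text (e.g., with a speech-to-text model such as Whisper) and using stock LangChain components rather than the article's exact stack:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI

# Placeholder file: a transcript produced from the audio in a prior step.
transcript = open("episode_transcript.txt").read()

# Split the transcript into overlapping chunks for retrieval.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_text(transcript)

# Embed the chunks and store them in a local Chroma collection.
db = Chroma.from_texts(chunks, OpenAIEmbeddings())

# Retrieve the most relevant chunks and answer grounded in them.
qa = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=db.as_retriever())
print(qa.run("What was said about pricing?"))  # placeholder question
```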
Learning
An Introduction to the Problems of AI Consciousness
A blog post that highlights key definitions and ideas from philosophy and science relevant for the debates on AI consciousness.
Fine-Tuning LLMs: LoRA or Full-Parameter? An in-depth Analysis with Llama 2
A blog post that compares full-parameter fine-tuning with LoRA, as well as answers questions around the strengths and weaknesses of the two techniques.
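For reference, attaching LoRA adapters to Llama 2 takes only a few lines with Hugging Face's peft library. The hyperparameters below are common illustrative defaults, not the article's exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # gated model; requires access approval
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# LoRA trains small low-rank adapter matrices instead of all model weights.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```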
An article about different types of quantization used to reduce the memory footprint of models like Llama 2.
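As one concrete example of this kind of technique, 4-bit NF4 quantization can be applied at load time with Transformers and bitsandbytes. This is a sketch assuming that stack, which may differ from the article's exact tooling:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization: weights are stored in 4 bits while
# compute runs in half precision, roughly a 4x memory saving vs. fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # gated model; requires access approval
    quantization_config=bnb_config,
    device_map="auto",
)
```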
How to Build an Interactive Chat-Generation Model using DialoGPT and PyTorch
An article that showcases how to create interactive chats based on a pre-trained DialoGPT model from Hugging Face with the addition of the Intel Extension for PyTorch.
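The core chat loop follows the DialoGPT model card; the Intel Extension for PyTorch step is shown commented out since it only applies on Intel hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Optional optimization with Intel Extension for PyTorch:
# import intel_extension_for_pytorch as ipex
# model = ipex.optimize(model)

chat_history_ids = None
for _ in range(5):  # five chat turns
    user_input = input(">> You: ")
    new_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")
    # Append the new user message to the running conversation history.
    bot_input_ids = new_ids if chat_history_ids is None else torch.cat([chat_history_ids, new_ids], dim=-1)
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens (the bot's reply).
    print("Bot:", tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True))
```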
Libraries & Code
AutoGen
AutoGen is a framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks.
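A minimal two-agent sketch, loosely patterned on AutoGen's quickstart; the model name, API key, and task message are placeholders:

```python
from autogen import AssistantAgent, UserProxyAgent

# Placeholder LLM config; supply your own model and API key.
llm_config = {"config_list": [{"model": "gpt-4", "api_key": "sk-..."}]}

assistant = AssistantAgent("assistant", llm_config=llm_config)
# The user proxy can execute the code the assistant writes (here in ./coding).
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# The two agents converse until the task is solved or a termination condition is met.
user_proxy.initiate_chat(assistant, message="Plot NVDA's stock price change YTD and save it to a file.")
```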
KeyBERT
A minimal and easy-to-use keyword extraction tool that leverages BERT embeddings.
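A quick usage sketch (the document text is a placeholder):

```python
from keybert import KeyBERT

doc = """Supervised learning is the machine learning task of learning a function
that maps an input to an output based on example input-output pairs."""

kw_model = KeyBERT()  # defaults to the all-MiniLM-L6-v2 sentence-transformer
keywords = kw_model.extract_keywords(doc, keyphrase_ngram_range=(1, 2), stop_words="english")
print(keywords)  # list of (phrase, similarity-to-document) tuples
```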
Papers & Publications
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
Abstract:
Recent advances in 3D content creation mostly leverage optimization-based 3D generation via score distillation sampling (SDS). Though promising results have been exhibited, these methods often suffer from slow per-sample optimization, limiting their practical usage. In this paper, we propose DreamGaussian, a novel 3D content generation framework that achieves both efficiency and quality simultaneously. Our key insight is to design a generative 3D Gaussian Splatting model with companioned mesh extraction and texture refinement in UV space. In contrast to the occupancy pruning used in Neural Radiance Fields, we demonstrate that the progressive densification of 3D Gaussians converges significantly faster for 3D generative tasks. To further enhance the texture quality and facilitate downstream applications, we introduce an efficient algorithm to convert 3D Gaussians into textured meshes and apply a fine-tuning stage to refine the details. Extensive experiments demonstrate the superior efficiency and competitive generation quality of our proposed approach. Notably, DreamGaussian produces high-quality textured meshes in just 2 minutes from a single-view image, achieving approximately 10 times acceleration compared to existing methods.
Boolformer: Symbolic Regression of Logic Functions with Transformers
Abstract:
In this work, we introduce Boolformer, the first Transformer architecture trained to perform end-to-end symbolic regression of Boolean functions. First, we show that it can predict compact formulas for complex functions which were not seen during training, when provided a clean truth table. Then, we demonstrate its ability to find approximate expressions when provided incomplete and noisy observations. We evaluate the Boolformer on a broad set of real-world binary classification datasets, demonstrating its potential as an interpretable alternative to classic machine learning methods. Finally, we apply it to the widespread task of modelling the dynamics of gene regulatory networks. Using a recent benchmark, we show that Boolformer is competitive with state-of-the-art genetic algorithms with a speedup of several orders of magnitude.
Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization
Abstract:
We develop a re-weighted gradient descent technique for boosting the performance of deep neural networks. Our algorithm involves the importance weighting of data points during each optimization step. Our approach is inspired by distributionally robust optimization with f-divergences, which has been known to result in models with improved generalization guarantees. Our re-weighting scheme is simple, computationally efficient, and can be combined with any popular optimization algorithms such as SGD and Adam. Empirically, we demonstrate our approach's superiority on various tasks, including vanilla classification, classification with label imbalance, noisy labels, domain adaptation, and tabular representation learning. Notably, we obtain improvements of +0.7% and +1.44% over SOTA on DomainBed and Tabular benchmarks, respectively. Moreover, our algorithm boosts the performance of BERT on GLUE benchmarks by +1.94%, and ViT on ImageNet-1K by +0.9%. These results demonstrate the effectiveness of the proposed approach, indicating its potential for improving performance in diverse domains.
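For intuition, here is a minimal PyTorch sketch of the re-weighting idea, assuming the KL-divergence instance of f-divergence DRO, where higher-loss examples receive exponentially larger weight. The temperature and exact normalization below are our assumptions, not the paper's reference implementation:

```python
import torch
import torch.nn.functional as F

def reweighted_loss(logits, targets, temperature=1.0):
    """Importance-weight per-sample losses before aggregating.

    Sketch of distributionally-robust re-weighting (KL-divergence case):
    the softmax over per-sample losses gives higher-loss examples
    exponentially larger weight within the batch.
    """
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    # Detach so the weights act as constants in the backward pass.
    weights = torch.softmax(per_sample.detach() / temperature, dim=0)
    return (weights * per_sample).sum()

# Usage inside a standard training step (compatible with SGD, Adam, etc.):
# loss = reweighted_loss(model(x), y)
# loss.backward(); optimizer.step()
```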