Deep Learning Weekly: Issue #316
DeepMind's SynthID, Mastering ML Deployment, Designing Deep Networks to Process Other Deep Networks, a paper on Teaching Algorithmic Reasoning via In-context Learning, and many more!
This week in deep learning, we bring you DeepMind's SynthID, Mastering ML Deployment, Designing Deep Networks to Process Other Deep Networks, and a paper on Teaching Algorithmic Reasoning via In-context Learning.
You may also enjoy Meta AI's Code Llama, Concrete ML, Exploring the Use of Adversarial Learning in Improving Model Robustness, a paper on Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Introducing Code Llama, a state-of-the-art large language model for coding
Meta AI released Code Llama, a state-of-the-art LLM that is free for research and commercial use, capable of generating code, and natural language about code, from various prompts.
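For a feel of how such a model can be used, here is a minimal sketch of prompting a Code Llama checkpoint through Hugging Face transformers; the checkpoint id and generation settings are illustrative assumptions, not part of Meta's announcement.

```python
# Minimal sketch: code completion with an assumed Code Llama checkpoint.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "codellama/CodeLlama-7b-hf"  # assumed Hugging Face checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "def fibonacci(n):\n    \"\"\"Return the n-th Fibonacci number.\"\"\"\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```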
The State of AI at Work
Asana, a leading work management platform, released The State of AI at Work Report, powered by insights from its Work Innovation Lab.
Introducing ChatGPT Enterprise
OpenAI launched ChatGPT Enterprise, which offers enterprise-grade security and privacy, longer context windows for processing longer inputs, advanced data analysis capabilities, and much more.
Organize Your Prompt Engineering with CometLLM
Comet announced CometLLM, a solution for logging and visualizing all your prompts and chains to unleash the full potential of Large Language Models.
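As a rough illustration, the snippet below logs a single prompt/response pair with the comet_llm package; the exact argument names may differ across SDK versions, and the metadata shown is an arbitrary example.

```python
# Minimal sketch of logging one prompt/response pair with CometLLM.
# Assumes the comet_llm package and a COMET_API_KEY in the environment.
import comet_llm

comet_llm.log_prompt(
    prompt="Summarize the following ticket: ...",
    output="The customer reports a login failure after the v2.3 update.",
    metadata={"model": "gpt-4", "temperature": 0.2},  # free-form metadata
)
```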
Identifying AI-generated images with SynthID
DeepMind launched a beta version of SynthID, a tool for watermarking and identifying AI-generated images.
MLOps & LLMOps
Mastering ML Deployment
An article about how to use various tools and frameworks to deploy machine learning models in a streamlined and scalable way, covering topics such as Docker, Kubernetes, Helm, Terraform, Streamlit, Gradio, FastAPI, and more.
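As one small, hedged example of the kind of stack the article covers, here is a minimal FastAPI endpoint serving a pickled model; the model path and request schema are placeholders, not taken from the article.

```python
# Minimal sketch: serve a serialized scikit-learn model behind a FastAPI endpoint.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
with open("model.pkl", "rb") as f:  # hypothetical serialized model
    model = pickle.load(f)

class Features(BaseModel):
    values: list[float]  # flat feature vector expected by the model

@app.post("/predict")
def predict(features: Features) -> dict:
    prediction = model.predict([features.values])[0]
    return {"prediction": float(prediction)}

# Run locally with: uvicorn app:app --reload
```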
Lessons Learnt From Consolidating ML Models in a Large Scale Recommendation System
Netflix shares system design lessons from consolidating several related machine learning models for large-scale search and recommendation systems into a single unified model.
Randomizing very large datasets
Consider the problem of randomizing a dataset so large that it doesn't fit into memory. This article describes how to do it easily and (relatively) quickly in Python.
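A minimal sketch of the general idea (the article's exact approach may differ): scatter lines at random into temporary chunk files that each fit in memory, then shuffle each chunk and concatenate.

```python
# Out-of-core shuffle for a line-oriented text file, in two passes.
import random

def shuffle_large_file(src: str, dst: str, num_chunks: int = 64) -> None:
    chunk_paths = [f"{dst}.chunk{i}" for i in range(num_chunks)]
    chunks = [open(p, "w") for p in chunk_paths]
    with open(src) as f:
        for line in f:                      # pass 1: random scatter to chunks
            random.choice(chunks).write(line)
    for c in chunks:
        c.close()

    with open(dst, "w") as out:
        for p in chunk_paths:               # pass 2: shuffle each chunk in RAM
            with open(p) as c:
                lines = c.readlines()
            random.shuffle(lines)
            out.writelines(lines)           # temp chunk files can be removed afterwards
```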
ML pipelines for fine-tuning LLMs
An article that shares findings and demonstrates best practices for building a clean, production-grade ML pipeline for fine-tuning LLMs.
Learning
Designing Deep Networks to Process Other Deep Networks
An article about a new type of neural network that can process the weights of other neural networks while being invariant or equivariant to certain transformations of the input weights.
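To make the symmetry concrete, the quick check below shows that permuting the hidden units of a two-layer MLP leaves its function unchanged, which is exactly the kind of weight transformation such a network should be invariant or equivariant to. The toy shapes are illustrative.

```python
# Permuting hidden neurons (rows of W1, matching columns of W2) preserves the MLP's output.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)

def mlp(x, W1, b1, W2, b2):
    return W2 @ np.tanh(W1 @ x + b1) + b2

perm = rng.permutation(16)                      # reorder the hidden neurons
x = rng.normal(size=8)
out_original = mlp(x, W1, b1, W2, b2)
out_permuted = mlp(x, W1[perm], b1[perm], W2[:, perm], b2)
print(np.allclose(out_original, out_permuted))  # True
```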
Making LLMs lighter with AutoGPTQ and transformers
A blog post that presents the integration of the AutoGPTQ library in Transformers, making it possible to quantize LLMs with the GPTQ method.
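A sketch of the workflow the post describes: quantizing a small causal LM to 4-bit via transformers' GPTQConfig (requires the auto-gptq and optimum packages). The model id here is an arbitrary small example.

```python
# Quantize a causal LM with GPTQ directly through transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "facebook/opt-125m"  # small example model
tokenizer = AutoTokenizer.from_pretrained(model_id)
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Quantization runs during loading, using the calibration dataset above.
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=gptq_config
)
model.save_pretrained("opt-125m-gptq-4bit")
```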
Exploring the Use of Adversarial Learning in Improving Model Robustness
A comprehensive article that explores the use of adversarial learning in improving the robustness of machine learning models.
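As a hedged illustration of one standard ingredient of adversarial training, here is a minimal FGSM perturbation in PyTorch; the model, data, and epsilon are placeholders rather than the article's setup.

```python
# Fast Gradient Sign Method: perturb inputs in the direction that increases the loss.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Return x perturbed by epsilon in the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

# Inside a training loop, mixing clean and adversarial batches:
# x_adv = fgsm_perturb(model, x, y)
# loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
```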
Libraries & Code
Paulescu/hands-on-train-and-deploy-ml
Train and Deploy an ML REST API to predict crypto prices, in 10 steps
zama-ai/concrete-ml
Concrete ML is a Privacy-Preserving Machine Learning (PPML) open-source set of tools built on top of Concrete by Zama.
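A sketch of the library's scikit-learn-style workflow: train in the clear, compile to an FHE circuit, then run encrypted inference. Keyword arguments may vary across library versions.

```python
# Train, compile to FHE, and run encrypted inference with Concrete ML.
from concrete.ml.sklearn import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(n_bits=8)   # quantized model suitable for FHE
model.fit(X_train, y_train)
model.compile(X_train)                 # build the FHE circuit

y_pred = model.predict(X_test, fhe="execute")  # inference on encrypted data
```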
Papers & Publications
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers
Abstract:
We present Region-aware Open-vocabulary Vision Transformers (RO-ViT) - a contrastive image-text pretraining recipe to bridge the gap between image-level pretraining and open-vocabulary object detection. At the pretraining phase, we propose to randomly crop and resize regions of positional embeddings instead of using the whole image positional embeddings. This better matches the use of positional embeddings at region-level in the detection finetuning phase. In addition, we replace the common softmax cross entropy loss in contrastive learning with focal loss to better learn the informative yet difficult examples. Finally, we leverage recent advances in novel object proposals to improve open-vocabulary detection finetuning. We evaluate our full model on the LVIS and COCO open-vocabulary detection benchmarks and zero-shot transfer. RO-ViT achieves a state-of-the-art 34.1 APr on LVIS, surpassing the best existing approach by +7.8 points in addition to competitive zero-shot transfer detection. Surprisingly, RO-ViT improves the image-level representation as well and achieves the state of the art on 9 out of 12 metrics on COCO and Flickr image-text retrieval benchmarks, outperforming competitive approaches with larger models.
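A rough sketch of the cropped positional embedding idea described in the abstract, not the paper's code: during pretraining, sample a random region of the full positional-embedding grid and resize it back to the full grid, so region-level use at detection time is better matched.

```python
# Randomly crop and resize a grid of positional embeddings (illustrative shapes).
import torch
import torch.nn.functional as F

def cropped_pos_embed(pos_embed, grid=14):
    """pos_embed: (grid*grid, dim) full-image positional embeddings."""
    dim = pos_embed.shape[-1]
    pe = pos_embed.reshape(1, grid, grid, dim).permute(0, 3, 1, 2)  # 1, D, H, W
    h = torch.randint(2, grid + 1, ()).item()        # random crop height
    w = torch.randint(2, grid + 1, ()).item()        # random crop width
    top = torch.randint(0, grid - h + 1, ()).item()
    left = torch.randint(0, grid - w + 1, ()).item()
    crop = pe[:, :, top:top + h, left:left + w]
    resized = F.interpolate(crop, size=(grid, grid), mode="bilinear",
                            align_corners=False)
    return resized.permute(0, 2, 3, 1).reshape(grid * grid, dim)
```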
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Abstract:
We introduce Graph of Thoughts (GoT): a framework that advances prompting capabilities in large language models (LLMs) beyond those offered by paradigms such as Chain-of-Thought or Tree of Thoughts (ToT). The key idea and primary advantage of GoT is the ability to model the information generated by an LLM as an arbitrary graph, where units of information ("LLM thoughts") are vertices, and edges correspond to dependencies between these vertices. This approach enables combining arbitrary LLM thoughts into synergistic outcomes, distilling the essence of whole networks of thoughts, or enhancing thoughts using feedback loops. We illustrate that GoT offers advantages over state of the art on different tasks, for example increasing the quality of sorting by 62% over ToT, while simultaneously reducing costs by >31%. We ensure that GoT is extensible with new thought transformations and thus can be used to spearhead new prompting schemes. This work brings the LLM reasoning closer to human thinking or brain mechanisms such as recurrence, both of which form complex networks.
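The following is an illustrative data-structure sketch of what the abstract describes (thoughts as vertices, dependency edges, and an aggregation step), not the authors' released framework.

```python
# Toy thought graph: vertices carry text, edges record which thoughts they depend on.
from dataclasses import dataclass, field

@dataclass
class Thought:
    text: str
    parents: list["Thought"] = field(default_factory=list)

def aggregate(parents: list[Thought], llm) -> Thought:
    """Ask the LLM to merge several partial results into one new thought."""
    prompt = "Combine the following partial solutions:\n" + "\n".join(
        p.text for p in parents
    )
    return Thought(text=llm(prompt), parents=parents)

# Example with a stand-in "LLM": merge two sorted half-lists into one thought.
fake_llm = lambda prompt: "[1, 2, 3, 4, 7, 9]"  # stand-in for a real LLM call
a = Thought("sorted left half: [1, 4, 7]")
b = Thought("sorted right half: [2, 3, 9]")
merged = aggregate([a, b], fake_llm)
print(merged.text)
```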
Teaching Algorithmic Reasoning via In-context Learning
Abstract:
Large language models (LLMs) have shown increasing in-context learning capabilities through scaling up model and data size. Despite this progress, LLMs are still unable to solve algorithmic reasoning problems. While providing a rationale with the final answer has led to further improvements in multi-step reasoning problems, Anil et al. 2022 showed that even simple algorithmic reasoning tasks such as parity are far from solved. In this work, we identify and study four key stages for successfully teaching algorithmic reasoning to LLMs: (1) formulating algorithms as skills, (2) teaching multiple skills simultaneously (skill accumulation), (3) teaching how to combine skills (skill composition) and (4) teaching how to use skills as tools. We show that it is possible to teach algorithmic reasoning to LLMs via in-context learning, which we refer to as algorithmic prompting. We evaluate our approach on a variety of arithmetic and quantitative reasoning tasks, and demonstrate significant boosts in performance over existing prompting techniques. In particular, for long parity, addition, multiplication and subtraction, we achieve an error reduction of approximately 10x, 9x, 5x and 2x respectively compared to the best available baselines.
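An illustrative (not verbatim) example of what an algorithmic prompt might look like for addition: the in-context demonstration spells out every intermediate step and carry, rather than only a high-level rationale, and the model is asked to continue in the same style.

```python
# Hypothetical algorithmic prompt in the spirit of the paper (not its exact prompts).
ALGORITHMIC_PROMPT = """\
Problem: 128 + 367
Explanation: Add digits right to left, tracking the carry.
8 + 7 = 15, write 5, carry 1.
2 + 6 + 1 = 9, write 9, carry 0.
1 + 3 + 0 = 4, write 4, carry 0.
Answer: 495

Problem: 254 + 189
Explanation:"""

# completion = llm(ALGORITHMIC_PROMPT)  # the model continues the same step-by-step procedure
```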