Deep Learning Weekly: Issue #198

An alternative to GPT-3, the final projects of the OpenAI Scholars, Intel’s photorealism enhancement model for an elevated GTA V experience, a game theory reformulation of PCA, and more

Hey folks,

This week in deep learning, we bring you a free alternative to GPT-3, a discrete choice and neural network model that enhances travel behavior research, the final projects of the OpenAI Scholars, and a game theory reformulation of PCA.

You may also enjoy Intel's photorealism enhancement model that makes GTA V look more realistic, USPS's adoption of Edge AI and Triton for item tracking, a paper on diffusion models, a paper on efficient anchor-free object detector guidance, and more!

As always, happy reading and hacking. If you have something you think should be in next week’s issue, find us on Twitter: @dl_weekly.

Until next week!


GPT-3’s free alternative GPT-Neo is something to be excited about

EleutherAI released two GPT-style language models, GPT-Neo 1.3B and GPT-Neo 2.7B, trained on an 825 GB dataset using Google’s TPU Research Cloud. GPT-Neo 2.7B outperforms GPT-3 Ada, its closest competitor in terms of parameter size.

SMART breakthrough uses artificial neural networks to enhance travel behavior research

Researchers from the Singapore-MIT Alliance for Research and Technology (SMART) have created TB-ResNet, a framework that combines discrete choice models and deep neural networks to improve travel behavior research.

Intel is using machine learning to make GTA V look incredibly, unsettlingly realistic

Intel researchers Stephan R. Richter, Hassan Abu Alhaija, and Vladlen Koltun created a photorealism enhancement model that draws on the Cityscapes dataset to make GTA V look photorealistic at interactive frame rates.

New Bayesian system cleans messy data tables automatically

MIT researchers have created a domain-specific and automatic data table cleaning system, called PClean, based on Bayesian probability and recent progress in probabilistic programming.

IBM taps AI for new workflow automation and data migration tools

IBM unveiled AI-powered enterprise products such as Mono2Micro, which streamlines cloud app migration, and Watson Orchestrate, which automates work in business tools from Salesforce, SAP, and Workday.

Too Perilous For AI? EU Proposes Risk-Based Rules

The European Commission recently published a proposal for regulations and risk management to govern artificial intelligence use in the European Union.

Mobile & Edge

Building a TinyML Application with TF Micro and SensiML

A comprehensive tutorial on creating a TinyML application for the Arduino Nano 33 BLE Sense that recognizes different boxing punches in real time using gyroscope and accelerometer data.

USPS Adopts Edge AI and Triton for Item Tracking

A USPS architect, along with half a dozen NVIDIA architects, designed the Edge Computing Infrastructure Program (ECIP), a distributed edge AI system meant for large-scale image analysis and other deep learning tasks.

Portenta Machine Control: Add a powerful brain to your machines

A fully centralized, low-power industrial control unit that enables a wide range of predictive maintenance and AI use cases.

Under $100 and Less Than 1mW: Pneumonia Detection Solution for Everyone

A short article detailing a pneumonia detection solution using Edge Impulse Studio and balenaCloud on a Raspberry Pi.


Game theory as an engine for large-scale data analysis: EigenGame maps out a new approach to solve fundamental ML problems

DeepMind presents a reformulation of principal component analysis (PCA), a type of eigenvalue problem, as a competitive multi-agent game called EigenGame.
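To see the idea concretely, here is a minimal numpy sketch of the EigenGame utilities: each "player" ascends its Rayleigh-quotient reward while being penalized for aligning with earlier players' directions. This is a simplified sequential version, not DeepMind's parallel implementation, and the matrix and hyperparameters below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def eigengame(M, k, steps=2000, lr=0.1):
    """Recover the top-k eigenvectors of symmetric M, one player at a time."""
    d = M.shape[0]
    V = []
    for i in range(k):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)
        for _ in range(steps):
            # Player i's reward pushes along M v; the penalty terms punish
            # alignment with earlier players' (already converged) directions.
            grad = M @ v
            for u in V:
                grad -= (v @ M @ u) / (u @ M @ u) * (M @ u)
            # Riemannian update: project the gradient onto the tangent
            # space of the unit sphere, step, then renormalize.
            grad -= (grad @ v) * v
            v += lr * grad
            v /= np.linalg.norm(v)
        V.append(v)
    return np.stack(V)

M = np.array([[3.0, 1.0], [1.0, 2.0]])   # toy symmetric matrix
V = eigengame(M, k=2)
```

At the game's Nash equilibrium the players' strategies line up with the eigenvectors, so the Rayleigh quotients `V[i] @ M @ V[i]` match the eigenvalues in descending order.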

Towards Human-Centered Explainable AI: the journey so far

A technical proposal for Reflective Human-Centered Explainable AI (HCXAI), a sociotechnically informed mindset grounded in critical AI studies and HCI.

ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

A comprehensive blog post on ALIGN, a dual-encoder architecture trained with a contrastive loss on relatively noisy vision-language datasets such as Conceptual Captions.
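The dual-encoder contrastive objective can be sketched in a few lines of numpy: L2-normalize both towers, compute pairwise similarities, and apply a symmetric cross-entropy with the matched pairs on the diagonal. The random embeddings below stand in for the image and text encoders, and the temperature value is illustrative, not ALIGN's actual setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def contrastive_loss(img_emb, txt_emb, temperature=0.1):
    # L2-normalize both towers before taking dot products.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature     # pairwise similarity matrix
    labels = np.arange(len(img))           # matched pairs sit on the diagonal

    def xent(l):
        # Numerically stable softmax cross-entropy against the diagonal.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Symmetric loss: image-to-text plus text-to-image retrieval.
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly matched pairs drive the loss toward zero, while mismatched embeddings leave it near the log of the batch size.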

OpenAI Scholars 2021: Final Projects

OpenAI Scholars showcase their final projects exploring topics like AI safety, contrastive learning, generative modeling, scaling laws, auto-encoding multi-objective tasks, test time compute, NLP segmentation strategies, and summarization from human feedback.

Libraries & Code

dagster-io/dagster: A data orchestrator for machine learning, analytics, and ETL.


udacity/ML_SageMaker_Studies: Case studies, examples, and exercises for learning to deploy ML models using AWS SageMaker.

A number of tutorial notebooks for various case studies, exercises, and project files that illustrate parts of the SageMaker ML workflow and deployment.

Papers & Publications

Diffusion Models Beat GANs on Image Synthesis


We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for sample quality using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.85 on ImageNet 512×512.
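The classifier-guidance step described in the abstract can be sketched compactly: at each reverse-diffusion step, the model's predicted mean is shifted along the classifier's log-probability gradient, scaled by the model variance and a guidance weight. The toy "model" and "classifier" below are quadratic stand-ins invented for illustration; they are not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_mean_and_var(x, t):
    # Hypothetical denoising model: shrinks x toward the origin.
    return 0.9 * x, 0.1 * np.ones_like(x)

def classifier_log_prob_grad(x, y):
    # Hypothetical classifier: a Gaussian bump centred on a class prototype,
    # so the gradient of log p(y|x) simply points at the prototype.
    prototypes = {0: np.array([1.0, 1.0]), 1: np.array([-1.0, -1.0])}
    return prototypes[y] - x

def guided_step(x, t, y, scale=2.0):
    mu, var = model_mean_and_var(x, t)
    # Classifier guidance: shift the mean along the classifier gradient,
    # weighted by the model variance and the guidance scale.
    mu = mu + scale * var * classifier_log_prob_grad(mu, y)
    return mu + np.sqrt(var) * rng.standard_normal(x.shape)

x = rng.standard_normal(2)
for t in reversed(range(50)):
    x = guided_step(x, t, y=0)
```

Larger guidance scales pull samples harder toward the conditioning class, which is exactly the diversity-for-fidelity trade-off the abstract describes.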

Separate but Together: Unsupervised Federated Learning for Speech Enhancement from Non-IID Data


We propose FEDENHANCE, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients. We simulate a real-world scenario where each client only has access to a few noisy recordings from a limited and disjoint number of speakers (hence non-IID). Each client trains their model in isolation using mixture invariant training while periodically providing updates to a central server. Our experiments show that our approach achieves competitive enhancement performance compared to IID training on a single device and that we can further facilitate the convergence speed and the overall performance using transfer learning on the server-side. Moreover, we show that we can effectively combine updates from clients trained locally with supervised and unsupervised losses. We also release a new dataset LibriFSD50K and its creation recipe in order to facilitate FL research for source separation problems.
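The federated loop behind this setup can be illustrated with a generic FedAvg-style sketch: clients holding non-IID data train locally in isolation, and a server periodically averages their updates. The scalar "model" and mean-squared objective below are stand-ins for illustration only; the paper's actual client objective is mixture invariant training, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, data, lr=0.1, epochs=20):
    # Each client fits its own (non-IID) data in isolation.
    for _ in range(epochs):
        w -= lr * 2 * (w - data.mean())   # gradient of MSE to the client mean
    return w

def federated_round(w_global, client_data):
    # Every client starts from the current global model, trains locally,
    # and the server averages the resulting models.
    updates = [local_update(w_global, d) for d in client_data]
    return float(np.mean(updates))

# Non-IID clients: each only sees samples clustered around its own value.
clients = [rng.normal(mu, 0.1, size=32) for mu in (0.0, 1.0, 2.0)]
w = 0.0
for _ in range(5):
    w = federated_round(w, clients)
```

Even though no client ever sees the full distribution, the averaged model settles near the global optimum, which is the core promise of federated training on non-IID data.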

PAFNet: An Efficient Anchor-Free Object Detector Guidance


Object detection is a basic but challenging task in computer vision that plays a key role in a variety of industrial applications. However, deep-learning-based object detectors usually require more storage and longer inference times, which seriously hinders their practicality, so a trade-off between effectiveness and efficiency is necessary in practical scenarios. Freed from the constraints of pre-defined anchors, anchor-free detectors can achieve acceptable accuracy and inference speed simultaneously. In this paper, we start from an anchor-free detector called TTFNet, modify its structure, and introduce multiple existing tricks to realize effective server and mobile solutions. Since all experiments in this paper are conducted with PaddlePaddle, we call the model PAFNet (Paddle Anchor Free Network). On the server side, PAFNet achieves a better balance between effectiveness (42.2% mAP) and efficiency (67.15 FPS) on a single V100 GPU. On the mobile side, PAFNet-lite achieves a better accuracy of 23.9% mAP at 26.00 ms on a Kirin 990 ARM CPU, outperforming existing state-of-the-art anchor-free detectors by significant margins.