Deep Learning Weekly: Issue #190
AI that more closely mimics the mind, AI bias and civil rights, a novel method for detecting deepfakes, faster deep learning for drug discovery, and more
Hey folks,
This week in deep learning we bring you a new deep learning technique for faster drug discovery, Facebook’s plans to train its AI on users’ public videos, an AI platform that more closely mimics the mind, a Spanish startup that now offers federated learning services, and a practical way to compute massive graphs in parallel.
You may also enjoy this tutorial on implementing real-time pose estimation on mobile, Google’s learnable frontend for audio classification, a beginner’s guide to OpenAI’s CLIP, a simple library that generates visuals from music, this paper on inverting the inherence of Convolutional Networks, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Industry
Artificial intelligence that more closely mimics the mind
Nara Logics’ AI platform uses recent neuroscience discoveries in its engine to provide more flexible, brain-based solutions. Currently, this technology serves health care organizations, consumer companies, manufacturers, and the federal government.
Facebook’s new big AI project is training its machine on users’ public videos
With access to an astronomical amount of public video data, Facebook plans to train its AI systems on users’ public videos, feeding projects that include its upcoming AI-powered smart glasses.
Fighting AI bias needs to be a key part of Biden’s civil rights agenda
The Algorithmic Accountability Act, which would require large tech companies to conduct bias and impact assessments of their automated systems, may be on its way.
Faster drug discovery through machine learning
DeepBAR is a new technique that calculates precise binding free energies, a key determinant of drug efficacy, orders of magnitude faster than previous methods.
Massively Parallel Graph Computation: From Theory to Practice
Adaptive Massively Parallel Computation (AMPC) augments the theoretical capabilities of MapReduce, providing a practical pathway to solving many graph problems up to 7x faster than state-of-the-art approaches.
Scientists developed a clever way to detect Deepfakes by analyzing light reflections in the eyes
A surprisingly simple tool that detects deepfake portraits generated by a StyleGAN2 model by comparing the light reflections in the subject’s two eyes.
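The write-up doesn’t ship code, but the core idea is easy to illustrate: corneal highlights in a real photo tend to match between the two eyes, while GAN-generated faces often disagree. Below is a minimal Python/OpenCV sketch of that comparison, not the authors’ pipeline; the function name, crop sizes, file paths, and brightness threshold are illustrative assumptions.

```python
import cv2
import numpy as np

def highlight_iou(left_eye: np.ndarray, right_eye: np.ndarray, thresh: int = 200) -> float:
    """Compare the bright specular highlights of two same-sized eye crops via IoU.

    Real faces tend to show near-identical highlight patterns in both corneas;
    GAN faces often don't, so a low score is suspicious.
    """
    masks = []
    for eye in (left_eye, right_eye):
        gray = cv2.cvtColor(eye, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
        masks.append(mask > 0)
    inter = np.logical_and(*masks).sum()
    union = np.logical_or(*masks).sum()
    return float(inter) / union if union else 0.0

# Example usage with hypothetical eye crops resized to a common shape:
# left = cv2.resize(cv2.imread("left_eye.png"), (64, 32))
# right = cv2.resize(cv2.imread("right_eye.png"), (64, 32))
# score = highlight_iou(left, right)   # low score -> possibly GAN-generated
```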
Mobile+Edge
Implementing Real-Time Pose Estimation on Mobile Using Flutter
A short tutorial on a basic PoseNet MobileNet V1 application using TensorFlow Lite and Flutter.
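The tutorial itself is written in Dart/Flutter; for readers who want to poke at the model outside a mobile app, here is a hedged Python sketch of the same inference flow using the TensorFlow Lite interpreter. The model filename is a placeholder, the sketch assumes the float (non-quantized) PoseNet export, and output ordering can vary between models.

```python
import numpy as np
import tensorflow as tf

# Placeholder path; the tutorial bundles a PoseNet MobileNet V1 .tflite file.
interpreter = tf.lite.Interpreter(model_path="posenet_mobilenet_v1.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# The float PoseNet MobileNet V1 model typically expects an RGB frame scaled to [-1, 1].
h, w = input_details[0]["shape"][1:3]
frame = np.random.rand(1, h, w, 3).astype(np.float32) * 2.0 - 1.0  # stand-in for a camera frame

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

heatmaps = interpreter.get_tensor(output_details[0]["index"])  # keypoint heatmaps
offsets = interpreter.get_tensor(output_details[1]["index"])   # per-keypoint offsets
print(heatmaps.shape, offsets.shape)
```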
A Spanish startup, otherwise known for its voice-based digital assistant and predictive search, now offers federated learning services.
eloquentarduino/EloquentTinyML: Eloquent interface to TensorFlow Lite for Microcontrollers
An Arduino library that simplifies the deployment of TensorFlow Lite for microcontrollers models to Arduino boards using the Arduino IDE.
A practical, hands-on example of edge computing in which a neural network is deployed on an ESP32 device.
Learning
LEAF: A Learnable Frontend for Audio Classification
An alternative method for crafting learnable spectrograms for audio understanding tasks. LEarnable Audio Frontend (LEAF) is a neural network that can be initialized to approximate mel filterbanks, and then be trained jointly with any audio classifier to adapt to the task at hand.
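To make the idea concrete, here is a simplified stand-in (not LEAF’s actual Gabor-filter frontend): a filterbank matrix initialized to mel and exposed as a trainable parameter, so it can be learned jointly with whatever classifier sits on top. The sample rate, FFT size, and hop length are illustrative choices.

```python
import librosa
import torch
import torch.nn as nn

class LearnableMelFrontend(nn.Module):
    """Simplified learnable frontend: a filterbank initialized to mel (not LEAF's
    Gabor convolutions) applied to STFT magnitudes, trainable with a classifier."""

    def __init__(self, sr=16000, n_fft=400, hop=160, n_mels=40):
        super().__init__()
        self.n_fft, self.hop = n_fft, hop
        mel = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)  # (n_mels, n_fft//2 + 1)
        self.fbank = nn.Parameter(torch.tensor(mel, dtype=torch.float32))  # learnable

    def forward(self, wav):  # wav: (batch, samples)
        spec = torch.stft(wav, self.n_fft, hop_length=self.hop,
                          window=torch.hann_window(self.n_fft, device=wav.device),
                          return_complex=True).abs()        # (batch, n_fft//2 + 1, frames)
        return torch.log(self.fbank @ spec + 1e-6)          # (batch, n_mels, frames)

frontend = LearnableMelFrontend()
features = frontend(torch.randn(2, 16000))  # two 1-second clips of white noise
print(features.shape)  # torch.Size([2, 40, 101])
```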
A Machine Learning Engineer’s Tutorial to Transfer Learning for Multi-class Image Segmentation…
A short hands-on tutorial on how to implement a multi-class image segmentation solution from a binary semantic segmentation base.
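The article’s exact architecture isn’t reproduced here; the sketch below shows the general pattern in PyTorch, assuming a torchvision DeepLabV3 backbone as a stand-in for the binary base and a hypothetical four-class target.

```python
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_CLASSES = 4  # hypothetical number of target classes

# Start from a pretrained segmentation model (stand-in for the article's binary base).
# Older torchvision API; newer versions use weights= instead of pretrained=.
model = deeplabv3_resnet50(pretrained=True)

# Swap the final 1x1 classifier so the head emits NUM_CLASSES channels.
model.classifier[4] = nn.Conv2d(256, NUM_CLASSES, kernel_size=1)

# Optionally freeze the backbone and train only the new head at first.
for p in model.backbone.parameters():
    p.requires_grad = False

out = model(torch.randn(1, 3, 256, 256))["out"]   # (1, NUM_CLASSES, 256, 256)
loss = nn.CrossEntropyLoss()(out, torch.zeros(1, 256, 256, dtype=torch.long))
```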
A Beginner's Guide to the CLIP Model
An accessible introduction to how OpenAI’s CLIP model works and its applications.
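As a quick taste of zero-shot classification with CLIP, here is a minimal sketch using the Hugging Face checkpoint openai/clip-vit-base-patch32; the image path and candidate labels are placeholders.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")          # placeholder path
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# CLIP scores each (image, text) pair; softmax over the texts gives zero-shot class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```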
Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers
A comprehensive tutorial on fine-tuning a pre-trained Wav2Vec2 checkpoint using Connectionist Temporal Classification (CTC) on the small Timit dataset.
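The post builds a character vocabulary from Timit and trains with the Trainer API; the stripped-down sketch below shows only the core step, the CTC loss that Wav2Vec2ForCTC computes when labels are supplied, and borrows the processor of an existing English checkpoint to stay self-contained. The audio and transcript are stand-ins.

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# The blog post builds its own tokenizer from Timit; here we borrow an existing
# English checkpoint's processor to keep the sketch runnable as-is.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

speech = torch.randn(16000).numpy()       # stand-in for a 1-second, 16 kHz Timit utterance
transcript = "SHE HAD YOUR DARK SUIT"

inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)
labels = processor.tokenizer(transcript, return_tensors="pt").input_ids

# Wav2Vec2ForCTC computes the CTC loss internally when labels are provided.
loss = model(inputs.input_values, labels=labels).loss
loss.backward()   # a fine-tuning step would follow with an optimizer.step()
```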
Libraries & Code
cortexlabs/cortex: Deploy, manage, and scale machine learning models in production
A cloud-native model serving platform for machine learning engineering teams.
mikaelalafriz/lucid-sonic-dreams
A simple library that uses generative adversarial networks to generate visuals synchronized to input music.
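Usage, as the README describes it, boils down to a couple of calls; treat the sketch below as an assumption-laden reminder rather than authoritative documentation, and check the repo for the current API and accepted arguments.

```python
# Assumed API, following the repo's README; verify against the project before relying on it.
from lucidsonicdreams import LucidSonicDream

dream = LucidSonicDream(song="track.mp3", style="abstract photos")  # any local audio file
dream.hallucinate(file_name="track.mp4")                            # renders a music-synced video
```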
Papers & Publications
Involution: Inverting the Inherence of Convolutional Networks for Visual Recognition
Abstract:
Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively.
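For readers who want to see the operator rather than read about it, here is a hedged PyTorch sketch of involution as the abstract describes it: kernels generated per spatial location from the input itself and shared across the channels in a group. The group count, reduction ratio, and kernel size below are illustrative, not necessarily the paper’s defaults.

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Sketch of involution: spatial-specific, channel-agnostic kernels, the
    inverse of convolution's spatial-agnostic, channel-specific design."""

    def __init__(self, channels, kernel_size=7, stride=1, groups=16, reduction=4):
        super().__init__()
        self.k, self.groups = kernel_size, groups
        self.pool = nn.AvgPool2d(stride) if stride > 1 else nn.Identity()
        # Kernel-generation function: bottleneck 1x1 convs producing K*K weights per group.
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True),
        )
        self.span = nn.Conv2d(channels // reduction, kernel_size * kernel_size * groups, 1)
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2, stride=stride)

    def forward(self, x):
        b, c, h, w = x.shape
        weight = self.span(self.reduce(self.pool(x)))            # (B, K*K*G, H', W')
        h_out, w_out = weight.shape[-2:]
        weight = weight.view(b, self.groups, 1, self.k * self.k, h_out, w_out)
        patches = self.unfold(x).view(b, self.groups, c // self.groups,
                                      self.k * self.k, h_out, w_out)
        out = (weight * patches).sum(dim=3)                      # aggregate over the K*K window
        return out.view(b, c, h_out, w_out)

y = Involution2d(64)(torch.randn(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```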
Incremental Potential Contact: Intersection- and Inversion-free Large Deformation Dynamics
Abstract:
Contacts weave through every aspect of our physical world, from daily household chores to acts of nature. Modeling and predictive computation of these phenomena for solid mechanics is important to every discipline concerned with the motion of mechanical systems, including engineering and animation. Nevertheless, efficiently time-stepping accurate and consistent simulations of real-world contacting elastica remains an outstanding computational challenge. To model the complex interaction of deforming solids in contact we propose Incremental Potential Contact (IPC) – a new model and algorithm for variationally solving implicitly time-stepped nonlinear elastodynamics. IPC maintains an intersection- and inversion-free trajectory regardless of material parameters, time step sizes, impact velocities, severity of deformation, or boundary conditions enforced.
Constructed with a custom nonlinear solver, IPC enables efficient resolution of time-stepping problems with separate, user-exposed accuracy tolerances that allow independent specification of the physical accuracy of the dynamics and the geometric accuracy of surface-to-surface conformation. This enables users to decouple, as needed per application, desired accuracies for a simulation’s dynamics and geometry.
The resulting time stepper solves contact problems that are intersection free (and thus robust), inversion-free, efficient (at speeds comparable to or faster than available methods that lack both convergence and feasibility), and accurate (solved to user-specified accuracies). To our knowledge this is the first implicit time-stepping method, across both the engineering and graphics literature that can consistently enforce these guarantees as we vary simulation parameters.
In an extensive comparison of available simulation methods, research libraries and commercial codes we confirm that available engineering and computer graphics methods, while each succeeding admirably in custom-tuned regimes, often fail with instabilities, egregious constraint violations and/or inaccurate and implausible solutions, as we vary input materials, contact numbers and time step. We also exercise IPC across a wide range of existing and new benchmark tests and demonstrate its accurate solution over a broad sweep of reasonable time-step sizes and beyond (up to h=2s) across challenging large-deformation, large-contact stress-test scenarios with meshes composed of up to 2.3M tetrahedra and processing up to 498K contacts per time step. For applications requiring high-accuracy we demonstrate tight convergence on all measures. While, for applications requiring lower accuracies, e.g. animation, we confirm IPC can ensure feasibility and plausibility even when specified tolerances are lowered for efficiency.