Deep Learning Weekly: Issue #202
Quantum AI, Global AI Index, Embedded Machine Learning, PyTorch Autograd Engine, Templated Deep Learning, and more
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Google has unveiled a new Quantum AI campus aiming to build a useful quantum computer. This may have lots of interesting applications for AI research.
The Global AI Index benchmarks nations on their levels of investment, innovation, and implementation of artificial intelligence.
Georgetown researchers used the text generator GPT-3 to write misleading tweets about climate change and foreign affairs and, alarmingly, people found the posts persuasive.
Facebook announced that it is migrating all of its production AI systems to PyTorch, and gave an overview of its 1,700 PyTorch-based inference models in full production.
The Beijing Academy of Artificial Intelligence launched the latest version of Wudao, a deep learning language model that the lab dubbed the world’s largest ever, with 1.75 trillion parameters.
This article argues that China's handling of the pandemic showed the world how deep and specialized its health care data, algorithms, and A.I. research are.
Mobile & Edge
This course on Coursera teaches how to run deep neural networks and other complex ML algorithms on low-power devices like microcontrollers.
This article introduces dynamic regularization, a method used to train ML models in a federated learning context, i.e. when the data is decentralized among several nodes, typically IoT devices.
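To make the federated setting concrete, here is a minimal sketch of plain federated averaging on a toy one-parameter model. It is not the dynamic regularization method from the article, only the simpler baseline idea of training locally on each node and averaging the results on a server. The names `local_update` and `federated_round`, the toy data, and the learning rate are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical toy setup: each node privately holds (x, y) pairs for y = w * x.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient-descent steps on one node's own data."""
    for _ in range(steps):
        # Gradient of the mean squared error (w * x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, clients: List[Dataset]) -> float:
    """One round: every node trains locally, the server averages the results."""
    total = sum(len(d) for d in clients)
    # Weight each node's model by how many samples it holds.
    return sum(len(d) * local_update(w_global, d) for d in clients) / total


# Three nodes with different amounts of data, all generated from y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (3.0, 9.0)],
    [(2.5, 7.5)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(w)
```

Only model parameters travel between the nodes and the server here; the raw `(x, y)` pairs never leave the node that holds them.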
Having recently raised $75M in funding, Mythic AI has launched the first chip based on analog computing. This bears watching: the company intends to submit results to industry performance benchmarks.
A security researcher at Google explains how to use deep learning to recover encryption keys from CPU power consumption traces.
This post dives deep into how PyTorch's autograd engine computes gradients through automatic differentiation. It provides very clear explanations using simple examples.
Facebook introduced TextStyleBrush, an AI research project that can copy the style of text in a photo using just a single word, which makes it possible, for example, to edit and replace text in images.
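For readers who want a feel for the idea before reading the post, here is a minimal sketch of reverse-mode automatic differentiation on scalars. It is a toy illustration in pure Python, not PyTorch's actual engine. The `Value` class and its attributes are hypothetical names chosen for this example.

```python
class Value:
    """A scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents          # inputs that produced this value
        self.local_grads = local_grads  # d(this) / d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Build a topological order so a node is processed only after every
        # node that depends on it has already passed its gradient down.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            # Chain rule: accumulate local gradient times upstream gradient.
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += local * node.grad


x = Value(2.0)
y = Value(3.0)
z = x * y + x * x   # dz/dx = y + 2x, dz/dy = x
z.backward()
print(z.data, x.grad, y.grad)
```

The forward pass builds a graph of operations as a side effect of ordinary arithmetic, and `backward` walks that graph in reverse while applying the chain rule.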
In this post, OVHCloud, a cloud provider, details how they monitor network devices and avoid overheat shutdowns using ML models.
Libraries & Code
This library extends basic PyTorch capabilities and makes the latest distributed training techniques available in the form of composable modules and easy-to-use APIs.
Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases by wrapping the functional APIs of existing frameworks. It currently supports JAX, TensorFlow, PyTorch, MXNet and NumPy.
Papers & Publications
A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.
Chip floorplanning is the engineering task of designing the physical layout of a computer chip. Despite five decades of research, chip floorplanning has defied automation, requiring months of intense effort by physical design engineers to produce manufacturable layouts. Here we present a deep reinforcement learning approach to chip floorplanning. In under six hours, our method automatically generates chip floorplans that are superior or comparable to those produced by humans in all key metrics, including power consumption, performance and chip area. To achieve this, we pose chip floorplanning as a reinforcement learning problem, and develop an edge-based graph convolutional neural network architecture capable of learning rich and transferable representations of the chip. As a result, our method utilizes past experience to become better and faster at solving new instances of the problem, allowing chip design to be performed by artificial agents with more experience than any human designer. Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators, and has the potential to save thousands of hours of human effort for each new generation. Finally, we believe that more powerful AI-designed hardware will fuel advances in AI, creating a symbiotic relationship between the two fields.
Knowledge distillation is a popular technique for training a small student network to emulate a larger teacher model, such as an ensemble of networks. We show that while knowledge distillation can improve student generalization, it does not typically work as it is commonly understood: there often remains a surprisingly large discrepancy between the predictive distributions of the teacher and the student, even in cases when the student has the capacity to perfectly match the teacher. We identify difficulties in optimization as a key reason for why the student is unable to match the teacher. We also show how the details of the dataset used for distillation play a role in how closely the student matches the teacher -- and that more closely matching the teacher paradoxically does not always lead to better student generalization.
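As a rough illustration of what "matching the teacher" means, here is a small sketch of a commonly used distillation objective: the KL divergence between temperature-softened teacher and student predictions. This is a generic formulation and not necessarily the exact setup used in the paper. The function names, example logits, and temperature are assumptions for illustration.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax with a temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) between the softened predictive distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)


teacher = [2.0, 1.0, 0.1]      # hypothetical teacher logits for one input
student = [1.2, 0.9, 0.4]      # hypothetical student logits for the same input
print(distillation_loss(student, teacher))
```

A loss of zero means the student reproduces the teacher's softened distribution exactly on that input; the discrepancy the abstract describes corresponds to this quantity staying well above zero.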