Deep Learning Weekly: Issue #187
Come work with us! We're accepting applications for an Associate Editor. Plus, detecting defects in manufactured products with DL, debugging neural networks, and a "Holistic" pipeline for on-device ML
Before we jump into this week’s newsletter, we have a favor to ask…come work with us! Deep Learning Weekly is seeking an Associate Editor. This is a part-time/contract position, perfect for someone who wants to share their passion for deep learning with the world. If you’re interested and think you’d be a good fit, we’d love to hear from you.
Now, back to the best news from the week in deep learning…
This week in deep learning we bring you Amazon's computer vision service to detect defects in manufactured products, Microsoft’s Azure Percept platform for bringing more of its Azure AI services to the edge, MIT’s student-run Driverless organization that partners with industry collaborators to develop and test autonomous technologies in real-world racing scenarios, and some practical tips for building and debugging neural networks.
You may also enjoy the official PyTorch package for the discrete VAE used for OpenAI’s DALL·E, this PyTorch library with implementations of representative GANs for conditional/unconditional image generation and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Amazon recently announced the general availability of Amazon Lookout for Vision, a cloud service that analyzes images using computer vision to spot product or process defects and anomalies in manufactured goods.
Leveraging research done on campus, student-run MIT Driverless partners with industry collaborators to develop and test autonomous technologies in real-world racing scenarios.
After decades of staying out of industrial policy, a Pentagon-appointed commission recommends more spending on research and support for US chip makers.
AI programs that analyze language have difficulty gauging context. Words such as “black,” “white,” and “attack” can have different meanings.
The Trevor Project, America’s hotline for LGBT youth, is turning to a GPT-2-powered chatbot to help troubled teenagers—but it’s setting strict limits.
Mobile + Edge
Microsoft recently announced Azure Percept, its new hardware and software platform for bringing more of its Azure AI services to the edge.
One expert says caching content at the edge, closer to its users, can prevent carbon emissions.
A look at Holistic Tracking — a new ML solution in MediaPipe that optimally integrates three widely-used machine vision capabilities.
Lyra is a new high-quality, very low-bitrate speech codec that leverages advances in ML to make voice communication possible even on the slowest networks.
A group of researchers makes an effort to bring together the causality and machine learning research programs, delineate the implications of causality for machine learning, and propose critical areas for future research.
In this post, the author highlights a few steps of their mental process when it comes to building and debugging neural networks.
A research team from Facebook AI has proposed a Unified Transformer (UniT) encoder-decoder model that jointly trains on multiple tasks across different modalities and achieves strong performance on seven tasks with a unified set of model parameters.
A team from Microsoft and Université de Montréal proposes a new mathematical framework that uses measure theory and integral operators to achieve the goal of quantifying the regularity of the attention operation.
This is the official PyTorch package for the discrete VAE used for DALL·E.
StudioGAN is a PyTorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.
Abstract: Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. With sufficient data and scale, our approach is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
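The core idea of the abstract — treating text and image tokens as a single autoregressive stream — can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy vocabulary sizes, the `build_stream` helper, and the offset trick are not taken from the paper, which uses BPE text tokens and the discrete VAE’s codebook indices.

```python
# Minimal sketch (toy sizes, hypothetical helper): DALL·E-style models
# concatenate text tokens and discrete image tokens into one sequence
# that a transformer models autoregressively.
TEXT_VOCAB = 1000    # assumed toy text vocabulary size
IMAGE_VOCAB = 8192   # dVAE codebook size reported for DALL·E

def build_stream(text_tokens, image_tokens):
    # Offset image token ids past the text vocabulary so both token
    # types live in one shared id space without colliding.
    return list(text_tokens) + [t + TEXT_VOCAB for t in image_tokens]

stream = build_stream([5, 17, 3], [0, 4091, 8191])
# Standard next-token prediction pairs over the combined stream:
# the model sees `inputs` and is trained to predict `targets`.
inputs, targets = stream[:-1], stream[1:]
```

With one stream and one objective, the same transformer learns to continue a text prefix with image tokens, which is what enables zero-shot text-to-image generation.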
Abstract: This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.