Deep Learning Weekly: Issue #186
Another AI researcher terminated by Google, no-code voice AI platform for MCUs, Apple's federated evaluation and learning system, the tech behind cinematic photos, and more.
Before we jump into this week’s issue, I wanted to let you know that we’ve moved this newsletter and its archives over to Substack, which allows our team more flexibility in creating and delivering the week’s best in deep learning to your inbox. No changes needed on your end—you’ll still receive the same weekly letter moving forward.
But stay tuned! We have big plans that will add new content and give you more options as a reader. Now, back to our regularly scheduled programming…
This week in deep learning we bring you a second AI researcher who says she was fired by Google, an EU report that warns that AI makes autonomous vehicles ‘highly vulnerable’ to attack, Google's Model Search, an open-source, domain-agnostic AutoML platform, and this IBM researcher, who found his name on two papers with which he had no connection.
You may also enjoy Apple's federated evaluation and learning system design, TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
“I wasn't proud of it, and neither were my coworkers. But that's life in today's China.”
Margaret Mitchell was the co-leader of a group investigating ethics in AI, alongside Timnit Gebru, who said she was fired in December.
An IBM researcher found his name on two papers with which he had no connection. A different paper listed a fictitious author by the name of "Bill Franks."
The dream of autonomous vehicles is that they can avoid human error and save lives, but a new European Union Agency for Cybersecurity (ENISA) report has found that autonomous vehicles are “highly vulnerable to a wide range of attacks” that could be dangerous for passengers, pedestrians, and people in other vehicles.
A new game will let you control the armed bot as it rampages through a gallery.
Photomath, the popular mobile app that helps you solve equations, has raised a $23 million Series B funding round led by Menlo Ventures.
Mobile + Edge
Customers can create voice models within their browsers instantly, using Picovoice Console. Once the models are trained, they can be downloaded and loaded onto a microcontroller using Picovoice Shepherd, without any embedded expertise.
Use ML Kit’s on-device translation API to translate text.
Recogni, a startup designing an AI-powered vision recognition module for autonomous vehicles, today announced it raised $48.9 million.
Edge chipsets are here to stay, improving how people use and produce data. Paired with AI, they make every task, no matter how small, simpler.
This post covers DreamerV2, the first world model-based Reinforcement Learning agent to achieve top-level performance on the Atari benchmark, learning general representations from images to discover successful behaviors in latent space.
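World-model agents like DreamerV2 learn a model of the environment and then improve behavior by planning inside it, never touching the real environment during planning. As a loose toy illustration of that idea only (DreamerV2 itself learns discrete latent representations with neural networks; the environment, names, and planner below are all invented for the sketch):

```python
import random

# Toy chain MDP: states 0..4, actions -1/+1, reward 1.0 for being at state 4.
N_STATES, GOAL = 5, 4

def env_step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, 1.0 if s2 == GOAL else 0.0

# 1) Fit a "world model" from interaction: one query per (state, action)
#    suffices here because the toy dynamics are deterministic.
model = {(s, a): env_step(s, a) for s in range(N_STATES) for a in (-1, 1)}

# 2) Plan entirely inside the learned model: score each first action by the
#    total return of imagined rollouts under a random continuation policy.
def imagined_return(s, first_a, horizon=6):
    total, a = 0.0, first_a
    for _ in range(horizon):
        s, r = model[(s, a)]
        total += r
        a = random.choice([-1, 1])  # random continuation policy
    return total

def plan(s, n_rollouts=200):
    return max((-1, 1),
               key=lambda a: sum(imagined_return(s, a) for _ in range(n_rollouts)))

random.seed(0)
print(plan(3))  # from state 3, imagined rollouts favor +1, toward the goal
```

The real agent replaces the tabular dictionary with a learned latent dynamics model and replaces random rollouts with a learned actor-critic, but the plan-in-the-model loop is the same shape.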
Apple has laid out the design characteristics of a new generic system that enables federated evaluation and tuning (FE&T) systems on end-user devices.
Google announced the release of Model Search, an open-source, domain-agnostic AutoML platform that helps researchers to efficiently and automatically develop optimal ML models.
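Model Search itself is a TensorFlow-based system that searches over neural architectures using trainers and a search policy; purely as a stand-in for the core idea, here is a generic random-search loop over a hypothetical configuration space, with a synthetic score in place of actual model training (every name and value below is invented):

```python
import random

# Hypothetical search space over toy model configurations.
SEARCH_SPACE = {
    "num_layers": [1, 2, 3],
    "width": [16, 32, 64],
    "dropout": [0.0, 0.2],
}

def evaluate(config):
    """Stand-in for train-then-validate: a synthetic score that rewards
    moderate capacity. A real AutoML system trains each candidate."""
    capacity = config["num_layers"] * config["width"]
    return -abs(capacity - 64) - 10 * config["dropout"]

def search(n_trials=20, seed=0):
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

best_cfg, best_score = search()
print(best_cfg, best_score)
```

Model Search goes further than this sketch: it reuses weights across trials and mutates the best candidates rather than sampling blindly, but the propose-evaluate-keep-the-best loop is the skeleton of any AutoML platform.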
Check out the new Cinematic photos feature in Google Photos, where ML is used to add simulated camera motion and parallax to a single-frame still photo, transforming a 2D photo into a more immersive 3D scene.
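Cinematic photos work by estimating a depth map from the single image and then rendering virtual camera motion, with nearer pixels shifting more than farther ones. A minimal sketch of just that parallax step, assuming a depth map is already available (the function and data here are illustrative, not Google's pipeline, which also inpaints the holes this shifting creates):

```python
def parallax_shift(image, depth, camera_offset):
    """image: 2D grid of pixel values; depth: same-shape grid (larger = farther).
    Shift each pixel horizontally by camera_offset / depth columns."""
    h, w = len(image), len(image[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        # paint far pixels first so nearer pixels overwrite them (occlusion)
        for x in sorted(range(w), key=lambda c: -depth[y][c]):
            nx = x + round(camera_offset / depth[y][x])
            if 0 <= nx < w:
                out[y][nx] = image[y][x]
    return out

image = [list("ABCD")]
depth = [[1.0, 1.0, 4.0, 4.0]]  # A, B are near; C, D are far
print(parallax_shift(image, depth, 2.0))
# → [[None, None, 'A', 'B']]  (near pixels moved 2 columns and occluded C, D;
#    the None holes are disocclusions a real pipeline would inpaint)
```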
This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.
A Bayesian optimisation library developed by Huawei Noah's Ark Decision Making and Reasoning (DMnR) lab, and the winning submission to the NeurIPS 2020 Black-Box Optimisation Challenge.
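HEBO's actual method is far more sophisticated (heteroscedastic Gaussian-process surrogates with evolutionary acquisition optimisation); purely to illustrate the sequential model-based loop that Bayesian optimisation follows, here is a toy version where a crude nearest-neighbour surrogate stands in for a GP posterior. Every function below is invented for the sketch:

```python
def objective(x):
    """Expensive black-box function (toy stand-in); maximum at x = 0.3."""
    return -(x - 0.3) ** 2

def ucb_acquisition(x, observed, beta=1.0):
    """Crude surrogate: predict with the nearest observed point and treat
    distance to it as uncertainty — an upper-confidence-bound stand-in
    for a real GP posterior mean + variance."""
    nearest_x, nearest_y = min(observed, key=lambda p: abs(p[0] - x))
    return nearest_y + beta * abs(x - nearest_x)

def optimise(n_iters=20):
    # Seed with the domain endpoints, then repeatedly evaluate the candidate
    # the acquisition function considers most promising.
    observed = [(0.0, objective(0.0)), (1.0, objective(1.0))]
    candidates = [i / 100 for i in range(101)]
    for _ in range(n_iters):
        x = max(candidates, key=lambda c: ucb_acquisition(c, observed))
        observed.append((x, objective(x)))
    return max(observed, key=lambda p: p[1])

best_x, best_y = optimise()
print(best_x, best_y)
```

The loop captures what makes Bayesian optimisation sample-efficient: each evaluation of the expensive objective is chosen by trading off the surrogate's predicted value against its uncertainty, rather than by grid or random sampling.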
AIST Dance Video Database (AIST Dance DB) is a shared database containing original street dance videos with copyright-cleared dance music. Project website here.
Abstract: To make off-screen interaction without specialized hardware practical, we investigate using deep learning methods to process the common built-in IMU sensor (accelerometers and gyroscopes) on mobile phones into a useful set of one-handed interaction events. We present the design, training, implementation and applications of TapNet, a multi-task network that detects tapping on the smartphone. With phone form factor as auxiliary information, TapNet can jointly learn from data across devices and simultaneously recognize multiple tap properties, including tap direction and tap location. We developed two datasets consisting of over 135K training samples, 38K testing samples, and 32 participants in total. Experimental evaluation demonstrated the effectiveness of the TapNet design and its significant improvement over the state of the art. Along with the datasets (this https URL) and extensive experiments, TapNet establishes a new technical foundation for off-screen mobile input.
Abstract: U-Net based convolutional neural networks with deep feature representation and skip-connections have significantly boosted the performance of medical image segmentation. In this paper, we study the more challenging problem of improving efficiency in modeling global contexts without losing localization ability for low-level details. TransFuse, a novel two-branch architecture, is proposed, which combines Transformers and CNNs in a parallel style. With TransFuse, both global dependency and low-level spatial details can be efficiently captured in a much shallower manner. Besides, a novel fusion technique, the BiFusion module, is proposed to fuse the multi-level features from each branch. TransFuse achieves new state-of-the-art results on the polyp segmentation task, with 20% fewer parameters and the fastest inference speed, at about 98.7 FPS.