Deep Learning Weekly: Issue #186

Another AI researcher terminated by Google, no-code voice AI platform for MCUs, Apple's federated evaluation and learning system, the tech behind cinematic photos, and more.

Hey folks,

Before we jump into this week’s issue, I wanted to let you know that we’ve moved this newsletter and its archives over to Substack, which allows our team more flexibility in creating and delivering the week’s best in deep learning to your inbox. No changes needed on your end—you’ll still receive the same weekly letter moving forward. 

But stay tuned! We have big plans that will add new content and give you more options as a reader. Now, back to our regularly scheduled programming…

This week in deep learning, we bring you a second AI researcher who says she was fired by Google, an EU report warning that AI makes autonomous vehicles ‘highly vulnerable’ to attack, Google's Model Search (an open-source, domain-agnostic AutoML platform), and an IBM researcher who found his name on two papers with which he had no connection.

You may also enjoy Apple's federated evaluation and learning system design, TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation, and more!

As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.

Until next week!


I helped build ByteDance's censorship machine

“I wasn't proud of it, and neither were my coworkers. But that's life in today's China.”

A Second AI Researcher Says She Was Fired by Google

Margaret Mitchell was the co-leader of a group investigating ethics in AI, alongside Timnit Gebru, who said she was fired in December.

The AI Research Paper Was Real. The ‘Coauthor’ Wasn't

An IBM researcher found his name on two papers with which he had no connection. A different paper listed a fictitious author by the name of "Bill Franks."

EU report warns that AI makes autonomous vehicles ‘highly vulnerable’ to attack

The dream of autonomous vehicles is that they can avoid human error and save lives, but a new European Union Agency for Cybersecurity (ENISA) report has found that autonomous vehicles are “highly vulnerable to a wide range of attacks” that could be dangerous for passengers, pedestrians, and people in other vehicles.

Boston Dynamics doesn't want you to shoot paintballs from Spot the robot

A new game will let you control the armed bot as it rampages through a gallery.

Math learning app Photomath raises $23 million as it reaches 220 million downloads

Photomath, the popular mobile app that helps you solve equations, has raised a $23 million Series B funding round led by Menlo Ventures.

Mobile + Edge

The First No-Code Voice AI Platform for Microcontrollers

Customers can create voice models within their browsers instantly, using Picovoice Console. Once the models are trained, they can be downloaded and loaded onto a microcontroller using Picovoice Shepherd, without any embedded expertise.

Translate text with ML Kit on Android

Use ML Kit’s On-device API to translate text.

Recogni raises $48.9 million for AI-powered perception chips

Recogni, a startup designing an AI-powered vision recognition module for autonomous vehicles, today announced it raised $48.9 million.

How Edge AI Chipsets Will Make AI Tasks More Efficient

Edge chipsets are here to stay and improve how people use and produce data. With the pairing of AI, every task, no matter how small, becomes simpler.


Mastering Atari with Discrete World Models

This post covers DreamerV2, the first world model-based Reinforcement Learning agent to achieve top-level performance on the Atari benchmark, learning general representations from images to discover successful behaviors in latent space.

Apple Reveals Design of Its On-Device ML System for Federated Evaluation and Tuning

Apple has laid out the design characteristics of a new generic system for federated evaluation and tuning (FE&T) on end-user devices.

Introducing Model Search: An Open Source Platform for Finding Optimal ML Models

Google announced the release of Model Search, an open-source, domain-agnostic AutoML platform that helps researchers to efficiently and automatically develop optimal ML models.

The Technology Behind Cinematic Photos

Check out the new Cinematic photos feature in Google Photos, where ML is used to add simulated camera motion and parallax to a single-frame still photo, transforming a 2D photo into a more immersive 3D scene.


[GitHub] microsoft/responsible-ai-widgets/

This project provides responsible AI user interfaces for Fairlearn, interpret-community, and Error Analysis, as well as foundational building blocks that they rely on.

[GitHub] huawei-noah/noah-research/tree/master/HEBO

A Bayesian optimisation library developed by Huawei Noah's Ark Decision Making and Reasoning (DMnR) lab, and the winning submission to the NeurIPS 2020 Black-Box Optimisation Challenge.


AIST Dance Video Database (AIST Dance DB)

AIST Dance Video Database (AIST Dance DB) is a shared database containing original street dance videos with copyright-cleared dance music.


TapNet: The Design, Training, Implementation, and Applications of a Multi-Task Learning CNN for Off-Screen Mobile Input

Abstract: To make off-screen interaction without specialized hardware practical, we investigate using deep learning methods to process the common built-in IMU sensor (accelerometers and gyroscopes) on mobile phones into a useful set of one-handed interaction events. We present the design, training, implementation and applications of TapNet, a multi-task network that detects tapping on the smartphone. With phone form factor as auxiliary information, TapNet can jointly learn from data across devices and simultaneously recognize multiple tap properties, including tap direction and tap location. We developed two datasets consisting of over 135K training samples, 38K testing samples, and 32 participants in total. Experimental evaluation demonstrated the effectiveness of the TapNet design and its significant improvement over the state of the art. Along with the datasets and extensive experiments, TapNet establishes a new technical foundation for off-screen mobile input.

TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

Abstract: U-Net based convolutional neural networks with deep feature representation and skip-connections have significantly boosted the performance of medical image segmentation. In this paper, we study the more challenging problem of improving efficiency in modeling global contexts without losing localization ability for low-level details. TransFuse, a novel two-branch architecture, is proposed, which combines Transformers and CNNs in a parallel style. With TransFuse, both global dependency and low-level spatial details can be efficiently captured in a much shallower manner. In addition, a novel fusion technique, the BiFusion module, is proposed to fuse the multi-level features from each branch. TransFuse achieves new state-of-the-art results on the polyp segmentation task, with 20% fewer parameters and the fastest inference speed at about 98.7 FPS.