| March 2 · Issue #75 |
| Google Cloud Platform Blog: Cloud TPU Machine Learning Accelerators Now Available in Beta |
Starting today, Cloud TPUs are available in beta on Google Cloud Platform (GCP) to help machine learning (ML) experts train and run their ML models more quickly. Built with four custom ASICs, each Cloud TPU packs up to 180 teraflops of floating-point performance and 64 GB of high-bandwidth memory onto a single board. Usage is billed by the second at the rate of $6.50 USD per Cloud TPU per hour.
| JupyterLab is Ready for Users – Jupyter Blog |
JupyterLab is a popular web-based notebook tool for researchers and data scientists. With the latest release, JupyterLab is now ready for daily use.
| Artificial Intelligence Faces Reproducibility Crisis |
The social sciences and medicine are not alone: the field of artificial intelligence (AI) faces a similar replication crisis. Unpublished code and sensitivity to training conditions have made it difficult for AI researchers to reproduce many key results. That is leading to a new conscientiousness about research methods and publication protocols.
| Preparing for Malicious Uses of AI |
A fascinating survey of potential security threats from malicious uses of artificial intelligence technologies. The authors lay out ways to better predict, prevent and mitigate these threats.
| China Overtakes US in AI Startup Funding With a Focus on Facial Recognition and Chips |
The latest comes from technology analysts CB Insights, which reports that China has overtaken the US in the funding of AI startups. The country accounted for 48 percent of the world’s total AI startup funding in 2017, compared to 38 percent for the US.
| The AI Talent Shortage |
A survey of the AI talent pool and how shallow it really is, concluding that there aren’t enough people for every good project worth attempting, whether academic or one that, if it works, could save a company $1M.
| Machine Learning Crash Course | Google Developers |
Google releases its machine learning educational material, which has previously been available only for internal training.
| Human-centered Machine Learning: a Machine-in-the-loop Approach |
A thoughtful post on human-centered machine learning, based on a framework that categorizes tasks along the two dimensions of 'difficulty for humans' and 'delegability to machines'.
| Kernels, Polysemy and AI |
This post explores the difference between two uses of the word “kernel” in AI. The first refers to a small window that is slid over an n-dimensional surface, such as a 2D image, in order to detect features. The second refers to an elemental building block of a machine learning model from which larger models are composed.
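As a concrete illustration of the first sense, here is a minimal NumPy sketch of a small window being slid over a 2D image to detect features; the function names and the toy edge-detection kernel are illustrative assumptions, not anything from the post.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a small kernel over a 2D image and compute the response at each
    position (a 'valid' convolution, the first sense of "kernel")."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 vertical-edge detector applied to a toy 6x6 image whose right half is bright.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
edge_kernel = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]], dtype=float)
print(conv2d(image, edge_kernel))  # responses peak where the image changes from 0 to 1
```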
| Accelerating I/O bound deep learning – RiseML Blog |
An increasing trend is that pre-processing, and especially reading the training data from disk, becomes the bottleneck. This is caused by multiple factors, including faster GPUs, more efficient model architectures, and larger datasets, especially for video and image processing. As a result, the GPUs sit idle much of the time, waiting for the next batch of data to work on. After optimising the pre-processing pipeline, I/O often becomes the next bottleneck.
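As a rough sketch of how this bottleneck is commonly addressed (parallelizing pre-processing and prefetching batches so the GPU stays busy), here is a hedged TensorFlow tf.data example; the file pattern, parse function, and sizes are placeholders rather than code from the RiseML post.

```python
import tensorflow as tf

def parse_example(serialized):
    # Placeholder parser: decode one TFRecord into a fixed-size (image, label) pair.
    features = tf.parse_single_example(
        serialized,
        {"image": tf.FixedLenFeature([], tf.string),
         "label": tf.FixedLenFeature([], tf.int64)})
    image = tf.image.decode_jpeg(features["image"], channels=3)
    image = tf.image.resize_images(image, [224, 224])
    return image, features["label"]

dataset = (tf.data.TFRecordDataset(tf.gfile.Glob("train-*.tfrecord"))
           .map(parse_example, num_parallel_calls=8)  # spread CPU pre-processing across cores
           .shuffle(10000)
           .batch(64)
           .prefetch(2))  # keep upcoming batches ready while the GPU works on the current one
```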
| Proving Generalization of Deep Nets via Compression |
This post introduces an elementary compression-based framework for proving generalization bounds. It shows that deep nets are highly noise stable and, consequently, compressible. The framework also yields simpler proofs of generalization results from several papers that appeared in the past year.
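As an informal illustration of what "noise stability" means in practice, here is a toy measurement sketch: inject Gaussian noise into an intermediate activation and compare the relative change in the output. This uses a random two-layer network purely as a stand-in (the paper's point is that trained deep nets attenuate such injected noise substantially); it is not the paper's proof technique, and all names and scales are assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy two-layer ReLU net with random weights (stand-in for a trained deep net).
W1 = rng.randn(256, 64) / np.sqrt(64)
W2 = rng.randn(10, 256) / np.sqrt(256)

def forward(x, noise_scale=0.0):
    h = np.maximum(W1 @ x, 0.0)
    # Inject Gaussian noise at the hidden layer, scaled relative to the layer's norm.
    h = h + noise_scale * rng.randn(*h.shape) * np.linalg.norm(h) / np.sqrt(h.size)
    return W2 @ h

x = rng.randn(64)
clean = forward(x)
noisy = forward(x, noise_scale=0.5)
# Noise stability: how much does the output move, relative to the noise injected?
print(np.linalg.norm(noisy - clean) / np.linalg.norm(clean))
```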
| Google-Landmarks: A New Dataset and Challenge for Landmark Recognition |
Google-Landmarks is the largest worldwide dataset for recognition of human-made and natural landmarks. As research shifts from recognizing general entities such as buildings, mountains and (of course) cats toward fine-grained, instance-level recognition, many researchers are designing machine learning algorithms capable of identifying the Eiffel Tower, Mount Fuji or Persian cats. This large annotated dataset could provide a significant boost for research in this area.
| Neural Voice Cloning with a Few Samples - Baidu Research |
Baidu’s Deep Voice project focuses on teaching machines to generate speech from text that sounds more human-like.
Beyond single-speaker speech synthesis, this new research demonstrates that a single system could learn to reproduce thousands of speaker identities, with less than half an hour of training data for each speaker. This capability is enabled by learning shared and discriminative information from speakers.
| Adversarial Examples that Fool both Human and Computer Vision |
Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes, such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, the authors create the first adversarial examples designed to fool humans. They do this by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to models with unknown parameters and architecture, and by modifying models to more closely match the initial processing of the human visual system.
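For readers unfamiliar with how such "small changes" are computed, here is a hedged PyTorch sketch of the classic fast gradient sign method (FGSM). It illustrates the general mechanism of adversarial perturbations, not the transfer-based procedure of this particular paper, and the model and data names are placeholders.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Craft a small adversarial perturbation with the fast gradient sign method."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel slightly in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage sketch (pretrained_model, images, and labels are placeholders):
# adv = fgsm_perturb(pretrained_model, images, labels, epsilon=0.01)
# pretrained_model(adv).argmax(1)  # often differs from the prediction on the clean images
```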
| Machine Theory of Mind |
Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans’ ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network – a ToMnet – which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone. Through this process, it acquires a strong prior model for agents’ behaviour, as well as the ability to bootstrap to richer predictions about agents’ characteristics and mental states using only a small number of behavioural observations.
| L4: Practical Loss-based Stepsize Adaptation for Deep Learning |
The authors propose a step size adaptation scheme for stochastic gradient descent. It operates directly with the loss function and rescales the gradient in order to make fixed predicted progress on the loss. They demonstrate its capabilities by strongly improving the performance of Adam and Momentum optimizers. The enhanced optimizers with default hyperparameters consistently outperform their constant step size counterparts, even the best ones, without a measurable increase in computational cost.
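A rough sketch of the core idea on plain gradient descent: choose the step size so that a linear model of the loss predicts a decrease equal to a fixed fraction α of the current loss. The α value, the assumption that the minimal loss is zero, and all names here are illustrative, not the paper's reference implementation.

```python
import numpy as np

def l4_step(params, grad, loss, direction=None, alpha=0.15, eps=1e-12):
    """One parameter update with a loss-based step size.

    The step size is chosen so that a linearization of the loss predicts a
    decrease of alpha * loss along the chosen update direction."""
    if direction is None:
        direction = grad  # plain gradient direction; an Adam or Momentum update would go here
    eta = alpha * loss / (np.dot(grad, direction) + eps)
    return params - eta * direction

# Toy usage on the 1-D quadratic loss L(w) = 0.5 * w**2:
w = np.array([5.0])
for _ in range(10):
    loss = 0.5 * float(w @ w)
    grad = w.copy()
    w = l4_step(w, grad, loss)
    print(loss)  # decreases by roughly the predicted fraction each step
```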