Deep Learning Weekly: Issue #243
Meta AI's collaborative data collection platform called Mephisto, NVIDIA's instant NeRF, a library for deep reinforcement learning in finance, and more
Hey Folks,
This week in deep learning, we bring you Meta AI's collaborative data collection platform called Mephisto, NVIDIA's instant NeRF, a library for deep reinforcement learning in finance, and a paper on autoregressive image generation using residual quantization.
You may also enjoy an open-sourced outlier, adversarial, and drift detection tool, auto-generated summaries in Google Docs, decision transformers in HuggingFace, a paper on DualStyleGAN, and more!
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Industry
Introducing Mephisto: A new platform for more open, collaborative data collection
Meta AI introduces Mephisto, a new open, collaborative way to collect, share, and iterate on best practices for collecting data to train AI models.
NeRF Research Turns 2D Photos Into 3D Scenes
NVIDIA developed an approach called Instant NeRF, a neural rendering model that learns a high-resolution 3D scene in seconds — and can render images of that scene in a few milliseconds.
AI-Directed Algae Blooms Boost Biofuel Prospects
By using machine learning to precisely tune the growth of blue-green algae, also known as cyanobacteria, a Texas A&M University team produces over 43 grams of biomass per square meter per day in an outdoor experimental setup.
Starfall: Finding a Meteorite with Drones and AI
Seamus Anderson and his colleagues at Curtin University reported finding a meteorite in the remote Australian outback, one that once followed an elliptical orbit between those of Venus and Jupiter, using two drones and machine learning.
Comet releases ML practitioner survey highlighting the current industry challenges
A survey of over 500 U.S. machine learning practitioners uncovered challenges related to people, processes, and tools that cause friction during the complex process of developing ML models.
Nemesysco spinoff unveils emotional detection and AI tools for metaverse
Nemesysco is in the process of spinning off a new voice analytics company called Emotion Logic, which will use AI to detect and measure human emotions and improve interactions in the Metaverse.
Anch.AI launches free Ethical Governance Platform to encourage adoption of responsible AI practices
Swedish startup anch.AI announced the availability of its Ethical Governance Platform, bundling a number of useful tools for companies seeking responsible artificial intelligence governance.
MLOps
How to Compare Two or More Experiments in Comet
An article describing how to compare the results of two (or more) Machine Learning experiments through the graphical interface provided by the Comet platform.
Deploying TF Models on Heroku for Android Apps
A discussion on how to deploy an age detection TF model (in the .h5 format) to Heroku and then use it to make predictions on an Android device.
Alibi Detect is an open source Python library focused on outlier, adversarial, and drift detection.
A guide to Orchest for building ML pipelines
A technical article on the ML pipeline tool called Orchest, which does not require any third-party integration or DAGs.
Learning
Announcing FOMO (Faster Objects, More Objects)
Edge Impulse announces the official release of a new approach for running real-time object detection models (30x faster than MobileNet SSD) on constrained devices: Faster Objects, More Objects (FOMO).
Auto-generated Summaries in Google Docs
Google describes how automatically suggested summaries in Google Docs are powered by an ML model that comprehends document text and generates a 1-2 sentence description of the content.
Introducing Decision Transformers on Hugging Face
A technical tutorial and integration announcement on the Decision Transformer, an Offline Reinforcement Learning method.
Bias in machine learning: Types and examples
An article on the types and examples of machine learning bias, along with the ways to measure bias using SuperAnnotate.
Libraries & Code
AI4Finance-Foundation/FinRL: FinRL
FinRL is the first open-source project to explore the great potential of deep reinforcement learning in finance. This library is for pipelining a trading strategy using deep reinforcement learning.
Sionna: An Open-Source Library for Next-Generation Physical Layer Research
Sionna is an open-source Python library for link-level simulations of digital communication systems built on top of the open-source software library TensorFlow for machine learning.
Papers & Publications
Autoregressive Image Generation using Residual Quantization
Abstract:
For autoregressive (AR) modeling of high-resolution images, vector quantization (VQ) represents an image as a sequence of discrete codes. A short sequence length is important for an AR model to reduce its computational costs to consider long-range interactions of codes. However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off. In this study, we propose the two-stage framework, which consists of Residual-Quantized VAE (RQ-VAE) and RQ-Transformer, to effectively generate high-resolution images. Given a fixed codebook size, RQ-VAE can precisely approximate a feature map of an image and represent the image as a stacked map of discrete codes. Then, RQ-Transformer learns to predict the quantized feature vector at the next position by predicting the next stack of codes. Thanks to the precise approximation of RQ-VAE, we can represent a 256×256 image as 8×8 resolution of the feature map, and RQ-Transformer can efficiently reduce the computational costs. Consequently, our framework outperforms the existing AR models on various benchmarks of unconditional and conditional image generation. Our approach also has a significantly faster sampling speed than previous AR models to generate high-quality images.
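For readers new to the idea, here is a minimal sketch of residual quantization for a single feature vector. It is an illustration only, not the authors' implementation: the function names, the toy codebook, and the choice to reuse one codebook at every depth are assumptions made for this example. Each step picks the nearest code to whatever is still unexplained (the residual), so a stack of a few codes can approximate the vector more closely than one code from the same codebook.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def residual_quantize(vector, codebook, depth):
    """Quantize `vector` into a stack of `depth` codes.

    Hypothetical helper for illustration: at each depth the current
    residual is quantized, and the chosen code is subtracted from it.
    Returns the list of code indices and the summed approximation.
    """
    residual = list(vector)
    approx = [0.0] * len(vector)
    codes = []
    for _ in range(depth):
        k = nearest_code(residual, codebook)
        codes.append(k)
        # The approximation is the running sum of the selected codes.
        approx = [a + c for a, c in zip(approx, codebook[k])]
        # What remains to be explained by deeper codes.
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return codes, approx


# Toy 2-D codebook (made up for this example); it includes the zero vector
# so that an extra depth never has to make the approximation worse.
toy_codebook = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.0, 1.0),
    (0.25, 0.25),
    (-0.25, 0.0),
]

codes, approx = residual_quantize((1.2, 0.3), toy_codebook, depth=3)
print(codes, approx)
```

With depth 1 the vector (1.2, 0.3) is approximated by a single code, (1.0, 0.0). With deeper stacks the approximation moves closer to the input without enlarging the codebook, which is the trade-off the abstract describes.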
Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
Abstract:
Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data. In this paper, we explore more challenging exemplar-based high-resolution portrait style transfer by introducing a novel DualStyleGAN with flexible control of dual styles of the original face domain and the extended artistic portrait domain. Different from StyleGAN, DualStyleGAN provides a natural way of style transfer by characterizing the content and style of a portrait with an intrinsic style path and a new extrinsic style path, respectively. The delicately designed extrinsic style path enables our model to modulate both the color and complex structural styles hierarchically to precisely pastiche the style example. Furthermore, a novel progressive fine-tuning scheme is introduced to smoothly transform the generative space of the model to the target domain, even with the above modifications on the network architecture. Experiments demonstrate the superiority of DualStyleGAN over state-of-the-art methods in high-quality portrait style transfer and flexible style control.