Deep Learning Weekly: Issue #243
Meta AI's collaborative data collection platform called Mephisto, NVIDIA's instant NeRF, a library for deep reinforcement learning in finance, and more
This week in deep learning, we bring you Meta AI's collaborative data collection platform called Mephisto, NVIDIA's instant NeRF, a library for deep reinforcement learning in finance, and a paper on autoregressive image generation using residual quantization.
As always, happy reading and hacking. If you have something you think should be in next week's issue, find us on Twitter: @dl_weekly.
Until next week!
Meta AI introduces Mephisto, a new open, collaborative way to collect, share, and iterate on best practices for collecting data to train AI models.
NVIDIA developed an approach called Instant NeRF, a neural rendering model that learns a high-resolution 3D scene in seconds — and can render images of that scene in a few milliseconds.
By using machine learning to precisely tune the growth of blue-green algae, also known as cyanobacteria, the Texas A&M University team is able to produce over 43 grams per square meter a day in an outdoor experimental setup.
Seamus Anderson and his colleagues at Curtin University reported finding a meteorite in the remote Australian outback using two drones and machine learning. The meteorite once followed an elliptical orbit between the orbits of Venus and Jupiter.
A survey of over 500 U.S. machine learning practitioners uncovered challenges related to people, processes, and tools that are causing friction during the complex process of developing ML.
Nemesysco is in the process of spinning off a new voice analytics company called Emotion Logic, which will use AI to detect and measure human emotions and improve interactions in the Metaverse.
Swedish startup anch.AI announced the availability of its Ethical Governance Platform, bundling a number of useful tools for companies seeking responsible artificial intelligence governance.
An article describing how to compare the results of two (or more) Machine Learning experiments through the graphical interface provided by the Comet platform.
A discussion on how to deploy an age detection TF model (in the .h5 format) to Heroku and then use it to make predictions on an Android device.
Alibi Detect is an open source Python library focused on outlier, adversarial, and drift detection.
A technical article on the ML pipeline tool called Orchest, which does not require any third-party integration or DAGs.
Edge Impulse announces the official release of a new approach for running real-time object detection models (30x faster than MobileNet SSD) on constrained devices: Faster Objects, More Objects (FOMO).
Google describes how automatic summary suggestions in Google Docs are generated by an ML model that comprehends document text and produces a 1-2 sentence description of the content.
A technical tutorial and integration announcement on the Decision Transformer, an Offline Reinforcement Learning method.
An article on the types and examples of machine learning bias, along with the ways to measure bias using SuperAnnotate.
Libraries & Code
FinRL is the first open-source project to explore the great potential of deep reinforcement learning in finance. This library is for pipelining a trading strategy using deep reinforcement learning.
Sionna is an open-source Python library for link-level simulations of digital communication systems, built on top of TensorFlow.
Papers & Publications
For autoregressive (AR) modeling of high-resolution images, vector quantization (VQ) represents an image as a sequence of discrete codes. A short sequence length is important for an AR model to reduce its computational costs to consider long-range interactions of codes. However, we postulate that previous VQ cannot shorten the code sequence and generate high-fidelity images together in terms of the rate-distortion trade-off. In this study, we propose the two-stage framework, which consists of Residual-Quantized VAE (RQ-VAE) and RQ-Transformer, to effectively generate high-resolution images. Given a fixed codebook size, RQ-VAE can precisely approximate a feature map of an image and represent the image as a stacked map of discrete codes. Then, RQ-Transformer learns to predict the quantized feature vector at the next position by predicting the next stack of codes. Thanks to the precise approximation of RQ-VAE, we can represent a 256×256 image as 8×8 resolution of the feature map, and RQ-Transformer can efficiently reduce the computational costs. Consequently, our framework outperforms the existing AR models on various benchmarks of unconditional and conditional image generation. Our approach also has a significantly faster sampling speed than previous AR models to generate high-quality images.
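To make the residual quantization idea in the abstract above more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `residual_quantize`, the toy codebook, and the sizes are all hypothetical, and a real RQ-VAE learns its codebook jointly with an encoder and decoder. The sketch only shows the core loop: with one fixed codebook, each feature vector is approximated by a stack of codes, where every code quantizes what the previous codes left over.

```python
import numpy as np


def residual_quantize(z, codebook, depth):
    """Greedily quantize vector z into `depth` codes from one shared codebook.

    At each step the codebook entry nearest to the current residual is
    chosen, added to the running approximation, and subtracted from the
    residual. With depth=1 this is plain vector quantization.
    """
    residual = np.asarray(z, dtype=np.float64).copy()
    approx = np.zeros_like(residual)
    codes = []
    for _ in range(depth):
        # Squared Euclidean distance from the residual to every entry.
        dists = np.sum((codebook - residual) ** 2, axis=1)
        k = int(np.argmin(dists))
        codes.append(k)
        approx += codebook[k]
        residual -= codebook[k]
    return codes, approx


# Toy setup (hypothetical sizes): 16 codebook entries, 4-dimensional features.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))
# Assumption for this toy only: include a zero entry so that adding another
# code can never make the approximation worse.
codebook[0] = 0.0
z = rng.normal(size=4)

depth = 4
codes, approx = residual_quantize(z, codebook, depth)
errors = [
    float(np.linalg.norm(z - residual_quantize(z, codebook, d)[1]))
    for d in range(1, depth + 1)
]
print("stack of codes:", codes)
print("approximation error by depth:", errors)
```

Under these assumptions, an 8×8 feature map would be stored as an 8×8 grid of such code stacks, so the approximation is refined by stacking more codes per position rather than by enlarging the codebook or the spatial resolution.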
Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data. In this paper, we explore more challenging exemplar-based high-resolution portrait style transfer by introducing a novel DualStyleGAN with flexible control of dual styles of the original face domain and the extended artistic portrait domain. Different from StyleGAN, DualStyleGAN provides a natural way of style transfer by characterizing the content and style of a portrait with an intrinsic style path and a new extrinsic style path, respectively. The delicately designed extrinsic style path enables our model to modulate both the color and complex structural styles hierarchically to precisely pastiche the style example. Furthermore, a novel progressive fine-tuning scheme is introduced to smoothly transform the generative space of the model to the target domain, even with the above modifications on the network architecture. Experiments demonstrate the superiority of DualStyleGAN over state-of-the-art methods in high-quality portrait style transfer and flexible style control.