As we say a fond farewell to summer (bummer!), let’s look back at some of the stellar work published by Salesforce AI researchers during the past few months. (For more details, we encourage you to click the link for each project to read the full blog post.)
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels: Towards a Universal Object Detector
Most AI object detection methods work only on limited object categories, due to the human effort required for bounding-box annotations of training data. Chen Xing’s team developed a new method that automatically generates pseudo bounding-box annotations of diverse objects from large-scale image-caption pairs, removing the bottleneck caused by the need for human labeling.
Experimental results show that their method outperforms the state-of-the-art open vocabulary object detector.
And here’s the “Big Picture” result: their method’s AI-generated pseudo bounding-box labels, together with its strong generalization performance on novel datasets, bring us closer to the dream of a universal object detector.
Figure: Previous methods (left) rely on human-provided box-level labels of predefined base classes during training and try to generalize to objects of novel classes during inference. The new method (right) generates pseudo bounding-box labels from large-scale image-caption pairs (no humans required), then uses these pseudo labels to improve the open vocabulary object detector.
AI Coding with CodeRL: Toward Mastering Program Synthesis with Deep Reinforcement Learning
CodeRL is a new framework for program synthesis (also known as code generation) through holistic integration of pretrained language models and deep reinforcement learning. By utilizing unit test feedback as part of model training and inference, and integrating with an improved CodeT5 model, CodeRL achieves state-of-the-art results on competition-level programming tasks.
To use an analogy to chess, CodeRL is not just playing the game successfully (writing code that works); it’s competing at the master level.
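To make the idea of unit test feedback more concrete, here is a minimal sketch of how a candidate program could be scored against unit tests. This is not CodeRL's implementation: the function name `unit_test_reward`, the `solve` entry point, and the four reward values are assumptions chosen for illustration, and a real system would run untrusted code in a sandbox rather than with `exec`.

```python
from typing import List, Tuple

# Hypothetical reward values, chosen only for illustration.
REWARD_COMPILE_ERROR = -1.0
REWARD_RUNTIME_ERROR = -0.6
REWARD_FAILED_TEST = -0.3
REWARD_ALL_PASSED = 1.0


def unit_test_reward(
    source: str,
    tests: List[Tuple[tuple, object]],
    entry_point: str = "solve",
) -> float:
    """Score a candidate program by compiling it and running unit tests.

    Each test is an (args, expected_output) pair for the entry-point function.
    """
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return REWARD_COMPILE_ERROR

    namespace: dict = {}
    try:
        exec(code, namespace)  # unsafe outside a sandbox; fine for a sketch
        func = namespace[entry_point]
        results = [func(*args) for args, _ in tests]
    except Exception:
        return REWARD_RUNTIME_ERROR

    passed = all(result == expected for result, (_, expected) in zip(results, tests))
    return REWARD_ALL_PASSED if passed else REWARD_FAILED_TEST


tests = [((1, 2), 3), ((0, 0), 0)]
candidate = "def solve(a, b):\n    return a + b"
print(unit_test_reward(candidate, tests))  # 1.0
```

A scalar signal like this is the kind of feedback a reinforcement learning loop can optimize, which is the role unit tests play in the description above.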
The GIF below gives a high-level overview of how CodeRL works.
OmniXAI: Making Explainable AI Easy for Any Data, Any Models, Any Tasks
OmniXAI – short for Omni eXplainable AI – is designed to address many of the pain points in explaining decisions made by AI models. This open-source library aims to provide data scientists, machine learning engineers, and researchers with a one-stop Explainable AI (XAI) solution to analyze, debug, and interpret their AI models in a wide range of tasks and applications. OmniXAI’s powerful features and integrated framework make it a major addition to the burgeoning field of XAI.
At this point, you may be asking: Why XAI?
The answer: many AI models, especially those based on deep neural networks, are essentially black boxes that lack explainability. This may inhibit their adoption in critical applications and hamper people’s trust in AI systems. XAI was developed to address these challenges by explaining how AI models “think.” In short, XAI techniques (“explainers”) can reveal the reasoning behind the decisions AI models make, opening up the black box to show what’s inside. Such explanations can improve the transparency and persuasiveness of AI systems, and help AI developers improve model performance.
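For a feel of what an “explainer” does, here is a minimal sketch of one generic, model-agnostic technique: permutation feature importance. It shuffles one input feature at a time and measures how much the model's error grows. This is plain Python written for illustration, not OmniXAI's API; the function names and the toy model are assumptions.

```python
import random
from typing import Callable, List, Sequence


def mean_squared_error(
    model: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    targets: Sequence[float],
) -> float:
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(
    model: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    targets: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Return, per feature, the average increase in MSE after shuffling it."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the dataset with only this one column permuted.
            permuted = [
                list(r[:col]) + [v] + list(r[col + 1:])
                for r, v in zip(rows, shuffled)
            ]
            increases.append(mean_squared_error(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row: Sequence[float]) -> float:
    # A stand-in "black box" that only looks at the first feature.
    return 3.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
targets = [3.0 * r[0] for r in rows]
scores = permutation_importance(toy_model, rows, targets)
print(scores)  # large score for feature 0, zero for the ignored feature 1
```

The explanation here is a ranking of features by how much the model depends on them, which is one simple way of opening up the black box.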
The table below shows how OmniXAI gives its users a much larger palette of explanation methods to choose from, compared with other XAI libraries.
Meet Merlion: An End-to-End Easy-to-Use Machine Learning Library for Time Series Applications
What OmniXAI is to XAI, Merlion is to time series: a powerful one-stop solution designed to let users solve problems for a wide range of tasks.
Time series data is a critical source of insights for many applications (IT Ops, Quality Management, Financial Analytics, and Inventory & Sales Management, to name a few). However, while a variety of dedicated packages and software exist, engineers and researchers still face several daunting challenges when they try to experiment with or benchmark time-series analysis algorithms. Disparate programming interfaces for different models create a steep learning curve, and together with the process of selecting and training a model, data compatibility requirements, and intricate evaluation metrics, this limits the accessibility of such packages for a broad audience of potential users.
To address these issues, and combine several key functions into one tool, Huan Wang’s team developed Merlion: a Python library for time series intelligence.
Merlion provides an end-to-end machine learning framework that includes loading and transforming data, building and training models, post-processing model outputs, and evaluating model performance. It supports various time series learning tasks, including forecasting, anomaly detection, and change-point detection for both univariate and multivariate time series. The Merlion library helps solve a range of problems by providing engineers and researchers a one-stop solution to rapidly develop models for their specific time series needs, and benchmark them across multiple time-series datasets. Instead of having to learn and deploy multiple tools, you can do it all within a single, powerful framework.
As a complete end-to-end solution for many ML time series tasks, Merlion provides several key benefits:
- All-in-one design: comes with more built-in features/functions than other ML tools (as shown in the table below)
- A unified interface across all models and datasets
- Pre- and post-processing layers
- Anomaly score calibration to improve interpretability
- AutoML for hyperparameter tuning and model selection
- An evaluation framework that simulates model retraining
- Support for ensembles (combining multiple models)
- An easy-to-use visualization module
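One item in the list above, an evaluation framework that simulates model retraining, can be sketched in a few lines. The code below is plain Python for illustration and does not use Merlion's API; `rolling_evaluation`, `train_last_two_mean`, and the `retrain_every` parameter are hypothetical names, and the “model” is a deliberately trivial forecaster.

```python
from typing import Callable, List, Sequence

# A "trained model" here is just a function returning a one-step-ahead forecast.
Forecaster = Callable[[], float]


def train_last_two_mean(history: Sequence[float]) -> Forecaster:
    """Toy training step: forecast the mean of the last two observations."""
    window = history[-2:]
    forecast = sum(window) / len(window)
    return lambda: forecast


def rolling_evaluation(
    series: Sequence[float],
    train: Callable[[Sequence[float]], Forecaster],
    initial_train_size: int,
    retrain_every: int,
) -> float:
    """Walk forward through the series, retraining on a schedule.

    Returns the mean absolute error of the one-step-ahead forecasts.
    """
    model = train(series[:initial_train_size])
    abs_errors: List[float] = []
    for t in range(initial_train_size, len(series)):
        steps_since_start = t - initial_train_size
        if steps_since_start > 0 and steps_since_start % retrain_every == 0:
            model = train(series[:t])  # retrain on all data seen so far
        abs_errors.append(abs(model() - series[t]))
    return sum(abs_errors) / len(abs_errors)


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
mae_retrained = rolling_evaluation(series, train_last_two_mean, 2, retrain_every=1)
mae_stale = rolling_evaluation(series, train_last_two_mean, 2, retrain_every=100)
print(mae_retrained, mae_stale)  # 1.5 4.0
```

On this trending series, the model that is retrained at every step stays close to the data, while the model trained once drifts further away. Simulating the retraining schedule makes that difference visible before deployment.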
ETSformer: Exponential Smoothing Transformers for Time-Series Forecasting
ETSformer in a nutshell: transformers, transformed!
Gerald Woo’s team developed a new time-series forecasting model, called ETSformer, which leverages the power of two frameworks. By combining the classical intuition of seasonal-trend decomposition and exponential smoothing with modern transformers – as well as introducing novel exponential smoothing and frequency attention mechanisms – ETSformer gets some impressive results:
- Achieves state-of-the-art performance on six real-world time-series datasets from a range of application domains, including traffic forecasting, weather forecasting, and financial time-series forecasting
- Beats baselines in 22 out of 24 settings across various real-world datasets and different forecasting lengths (how far into the future the model forecasts)
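For readers new to the classical side of this combination, here is a minimal sketch of simple exponential smoothing, where each smoothed value is a weighted blend of the newest observation and the previous smoothed value. This is the textbook technique only, not ETSformer's exponential smoothing attention; the function names and the `alpha` parameter are assumptions for illustration.

```python
from typing import List, Sequence


def exponential_smoothing(series: Sequence[float], alpha: float) -> List[float]:
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # initialize with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def smoothing_weights(n: int, alpha: float) -> List[float]:
    """Weights the latest smoothed value places on x_t, x_{t-1}, ..., x_1.

    The weights decay exponentially with lag, and the oldest observation
    absorbs the remainder so that the weights sum to 1.
    """
    weights = [alpha * (1.0 - alpha) ** lag for lag in range(n - 1)]
    weights.append((1.0 - alpha) ** (n - 1))
    return weights


print(exponential_smoothing([2.0, 4.0, 8.0], alpha=0.5))  # [2.0, 3.0, 5.5]
print(smoothing_weights(3, alpha=0.5))                    # [0.5, 0.25, 0.25]
```

The second function shows the intuition worth keeping in mind: recent observations get more weight than older ones, and the weight falls off exponentially with lag.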
TaiChi: Open Source Library for Few-Shot NLP
TaiChi is an open source library for few-shot NLP, designed for data scientists and software engineers who want to get some quick results or build proof-of-concept products but don’t have much experience with few-shot learning (FSL).
The main things to know about TaiChi:
- The current release is version 1.0. As an open source project, TaiChi is another example of how Salesforce research efforts are helping the wider community.
- The library abstracts complex FSL methods into Python objects accessible through just one or two lines of code, greatly reducing the hurdle to learn and use the latest FSL methods.
- Primary contribution: a user-friendly API that lets engineers with no FSL experience experiment quickly and build up their knowledge.
For more details about the above projects, and to learn about other groundbreaking projects at Salesforce AI Research, please visit salesforceairesearch.com.