AUTHORS: Sharvin Shah, Jin Qu, Donald Rose
TL;DR: TaiChi is an open source library for few-shot NLP, designed for data scientists and software engineers who want to get some quick results or build proof-of-concept products but don’t have much experience with few-shot learning (FSL). The library abstracts complex FSL methods into Python objects that can be accessed through one or two lines of code, greatly reducing the hurdle to learn and use the latest FSL methods.
Background and Motivation
Tai Chi, well known as a Chinese martial art, emphasizes practicing “smart strength,” such as using the leverage of the joints, to gain great power with minimal effort.
Interestingly, this philosophy fits perfectly into few-shot learning (FSL) research: using “smart tricks”, one strives to train models that show good performance using a small amount of data.
Over the last few years, we have seen great progress in FSL research, thanks to work that has been done in pre-training, meta-learning, data augmentation, and public benchmark datasets. Since data collection and labeling are often expensive and time-consuming, breakthroughs in FSL research have huge potential use cases in the industry.
However, while FSL is an active research area and has great potential for many applications, off-the-shelf and user-friendly libraries have not been readily available for data scientists or software engineers to do quick exploration.
Our Approach: TaiChi
In the spirit of the martial art Tai Chi and its use of intelligent methods to achieve good performance with less effort, we developed an FSL library and named it TaiChi, in the hope that it will help others train models in low-data scenarios.
Here is our system in a nutshell:
- Tai Chi philosophy applied to machine learning
  - Get strong performance (model training) with minimal effort (less data)
  - Result: one can train models even if only a few examples are available
- Our TaiChi is an FSL library anyone can use
  - Open-source library for few-shot NLP
- Beginner-friendly, yet powerful
  - Doesn’t require users to have a high degree of knowledge about FSL
  - Designed for data scientists and software engineers who want to get quick results or build proof-of-concept products, even if they don’t have much experience with FSL
  - Users can still plug in some datasets and play with great FSL methods
- Removes hurdles so you can get going quicker
  - Abstracts complex FSL methods into Python objects that can be accessed through just one or two lines of code
  - Greatly reduces the hurdle to learn and use the latest FSL methods
Deep Dive: TaiChi in Detail
To provide a better understanding of our approach, let’s take a closer look at how TaiChi works.
Methods: DNNC and USLP
Our current release, TaiChi 1.0, contains two main FSL methods, DNNC and USLP, both designed for few-shot intent classification.
Why does TaiChi 1.0 use these two methods? Here’s a quick refresher:
- In 2020, Zhang et al. proposed framing few-shot intent classification as natural language inference (NLI) between query utterances and examples in the training set, a method known as discriminative nearest neighbor classification or DNNC.
- Inspired by this work, we proposed simplifying the NLI-style classification pipeline to entailment prediction between the query utterance and each semantic label, a method we call utterance semantic label pair (USLP). The semantic information in the labels can thus be infused into the classification process.
The figure below provides a quick comparison of standard intent classification with DNNC and USLP. Both DNNC and USLP are based on NLI-style classification, but while DNNC reframes classification as entailment prediction between the query and the utterances in the training set, USLP simplifies DNNC by predicting the entailment relationship between the utterance and the semantic labels.
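To make the NLI-style framing concrete, here is a minimal inference sketch. It uses the publicly available roberta-large-mnli checkpoint as a stand-in for the backbones described later, and the query, candidate labels, and entailment index are illustrative assumptions; TaiChi’s own API wraps these steps for you.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Stand-in NLI model; swap in an NLI-pretrained backbone for better results.
model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "can you move 100 dollars into my savings account"
intent_labels = ["transfer", "balance", "freeze account"]  # toy semantic labels

# USLP: score entailment between the query utterance and each semantic label.
# (DNNC instead scores entailment between the query and every training
# utterance, then adopts the nearest neighbor's intent.)
with torch.no_grad():
    inputs = tokenizer([query] * len(intent_labels), intent_labels,
                       padding=True, return_tensors="pt")
    logits = model(**inputs).logits

# For roberta-large-mnli, class index 2 is "entailment"; check
# model.config.id2label when using a different checkpoint.
entailment_prob = logits.softmax(dim=-1)[:, 2]
print(intent_labels[entailment_prob.argmax().item()])
```

Note how serving cost scales: USLP runs one entailment pass per label, while DNNC runs one per training example, which is why USLP is cheaper at inference time.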
Results and findings of using these two methods:
- Compared with DNNC, our proposed method is more efficient in both training and serving, since it computes entailment between the query utterance and the labels rather than between the query and all training examples. (Remember the Tai Chi philosophy: more with less!)
- The DNNC method requires more than one example per intent, while the USLP approach does not have such a constraint.
- In one-shot experiments, the USLP method outperforms the traditional classification approach.
- Longer and semantically meaningful labels tend to benefit model performance; however, the benefit shrinks as more training data is available.
Please refer to our DNNC and USLP papers for more details.
Backbone Models
We are also sharing the backbone models for DNNC and USLP. The models are based on public pre-trained models from Hugging Face, further fine-tuned on NLI data to adapt them to NLI-style classification.
- nli-pretrained-roberta-base: English-only model
- nli-pretrained-xlm-roberta-base: based on the XLM-RoBERTa model, which supports 100 languages; can be used for multilingual or cross-lingual projects
Please refer to the NLI pre-training pipeline here (Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference) if you would like to pre-train a new model.
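To load one of these backbones directly, a sketch like the following should work with the Hugging Face transformers library. The hub path below is a placeholder; substitute the actual released checkpoint location from the TaiChi repo.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder path: replace with the released checkpoint location.
backbone = "nli-pretrained-xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(backbone)
model = AutoModelForSequenceClassification.from_pretrained(backbone)
```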
Data
We use the CLINC150 dataset for benchmarks and tutorials. The original data_small.json is sub-sampled and further processed; users can download the processed dataset from here.
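For readers who want to build their own few-shot splits, here is a minimal K-shot sub-sampling sketch. It assumes the standard CLINC150 JSON layout, where each split maps to a list of [utterance, intent] pairs, and is not necessarily the exact processing applied to data_small.json.

```python
import json
import random
from collections import defaultdict

def sample_k_shot(path, split="train", k=5, seed=42):
    """Keep at most k randomly chosen utterances per intent."""
    with open(path) as f:
        data = json.load(f)  # split name -> list of [utterance, intent] pairs
    by_intent = defaultdict(list)
    for utterance, intent in data[split]:
        by_intent[intent].append(utterance)
    rng = random.Random(seed)
    return {intent: rng.sample(utts, min(k, len(utts)))
            for intent, utts in by_intent.items()}

few_shot_data = sample_k_shot("data_small.json", k=5)
```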
Feature Checklist for TaiChi 1.0
- Pythonic API: `from taichi import few_shot_learning_method` (see the usage sketch after this list)
- Based on PyTorch and the Hugging Face Transformers library
- Includes two recently published few-shot methods: DNNC and USLP
- Data sampling and error analysis API
- Examples on CLINC150 dataset for quick start
- Pre-trained English and multilingual transformer models, plus the preprocessed CLINC150 dataset, available here.
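As a feel for that one-or-two-lines pattern, here is a hypothetical usage sketch. The class name USLP and the train/predict methods are illustrative guesses that follow the import pattern above, not the library’s confirmed API; consult the GitHub repo for the real interface.

```python
from taichi import USLP  # hypothetical import following the pattern above

method = USLP(model_path="nli-pretrained-roberta-base")  # illustrative argument
method.train(few_shot_train_data)   # a handful of labeled examples per intent
predictions = method.predict(["move 100 dollars into savings"])
```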
The Big Picture
The TaiChi library serves as an API hub for various effective FSL methods proposed by the Salesforce Research team, which has done several FSL-related projects for research and application purposes.
The main contribution of this software is a user-friendly API that lets engineers with no FSL experience quickly experiment with these methods.
In the spirit of helping the wider community, TaiChi is open source.
The Bottom Line
- TaiChi (current release: 1.0) is an open source library for few-shot NLP
- Designed for data scientists and software engineers who want to get some quick results or build proof-of-concept products but don’t have much experience with few-shot learning (FSL)
- The library abstracts complex FSL methods into Python objects accessible through one or two lines of code, greatly reducing the hurdle to learn and use the latest FSL methods
- Main contribution: a user-friendly API that lets engineers without FSL experience quickly experiment with the latest FSL methods.
- TaiChi is open source, another example of how Salesforce research efforts are helping the wider community.
Explore More
Salesforce AI Research invites you to dive deeper into the concepts discussed in this blog post (links below). Connect with us on social media and our mailing list to get regular updates on this and other research projects.
- Code (Github): https://github.com/salesforce/TaiChi
- Contact: email Jin Qu at jqu@salesforce.com for any questions or feedback
- Other work: Check out Salesforce Research publications on FSL and other areas here.
Related Resources
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
- XLM-RoBERTa: Unsupervised Cross-lingual Representation Learning at Scale
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- USLP: Few-Shot Intent Classification by Gauging Entailment Relationship Between Utterance and Semantic Label
- DNNC: Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
- CLINC150 Dataset
About the Authors
Sharvin Shah is a Research Engineer at Salesforce AI Research. He is interested in topics related to conversational AI, such as data augmentation for intent recognition and conversational language modeling.
Jin Qu is a Research Engineer at Salesforce AI Research. His work focuses on NLP products R&D, which includes few-shot learning, multi-/cross-lingual, and knowledge distillation.
Donald Rose is a Technical Writer at Salesforce AI Research. Specializing in content creation and editing, Dr. Rose works on multiple projects, including blog posts, video scripts, news articles, media/PR material, social media, writing workshops, and more. He also helps researchers transform their work into publications geared towards a wider audience.
Appendix: Terms and Definitions
A review of some key terms used in our discussion:
- FSL: Few-Shot Learning
- CLINC150: a benchmark dataset for few-shot intent detection
- DNNC: Discriminative Nearest Neighbor Classification, a few-shot intent detection method
- USLP: Utterance Semantic Label Pair