TL;DR: OmniXAI (short for Omni eXplainable AI) is designed to address many of the
pain points in explaining decisions made by AI models. This open-source library
aims to provide data scientists, machine learning engineers, and researchers
with a one-stop Explainable AI (XAI) solution to analyze, debug, and interpret
their AI models for various data types in a wide range of tasks and
applications. OmniXAI’s powerful features and integrated framework make it a
major addition to the burgeoning field of XAI.
Background and Motivation: The Need for Explainable AI
With the rapidly growing adoption of AI models in real-world applications, AI
decision making can potentially have a huge societal impact, especially for
application domains such as healthcare, education, and finance. However, many AI
models, especially those based on deep neural networks, effectively work as
black-box models that lack explainability. This opacity inhibits their adoption in certain critical applications (even when their use would be beneficial) and hampers people’s trust in AI systems.
This limitation – the opacity of many AI systems – led to the current surge in
interest in Explainable AI (often referred to as XAI), an emerging field
developed to address these challenges. XAI aims to explain how an AI model
“thinks” — that is, XAI techniques (“explainers”) aim to reveal the reasoning
behind the decision(s) that an AI model has made, opening up the black box to
see what’s inside. Such explanations can improve the transparency and
persuasiveness of AI systems, as well as help AI developers debug and improve
model performance.
Several popular XAI techniques have emerged in the field, such as LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), and Integrated Gradients (IG), an XAI method designed for deep neural networks. Each method or “flavor” of XAI has its own strengths and features.
Limitations of Other XAI Libraries
Most existing XAI libraries support limited data types and models. For example,
InterpretML only supports tabular data, while Captum only supports PyTorch
models for images and texts.
In addition, different libraries have quite different interfaces, making it
inconvenient for data scientists or ML developers to switch from one XAI library
to another when seeking a new explanation method.
Also, a visualization tool is essential for users to examine and compare explanations, but most existing libraries don’t provide a convenient way to visualize and analyze them.
OmniXAI: One-Stop ML Library Makes XAI Easy for All
We developed an open-source Python library called OmniXAI (short for Omni
eXplainable AI), offering an “omni-directional” multifaceted approach to XAI
with comprehensive interpretable machine learning (ML) capabilities designed to
address many of the pain points in understanding and interpreting decisions made
by ML models in practice.
OmniXAI serves as a one-stop comprehensive library that makes explainable AI
easy for data scientists, ML researchers, and practitioners who need
explanations for any type of data, model, and explanation method at different
stages of the ML process (such as data exploration, feature engineering, model
development, evaluation, and decision making).
As the name implies, OmniXAI combines several elements into one. In the
discussion below, we’ll see how our approach allows users to take advantage of
features present in several popular explanation methods, all within a single
framework.
Deep Dive
Unique Features of OmniXAI
As shown in the figure below, OmniXAI offers a one-stop solution for analyzing
different stages in a standard ML pipeline in real-world applications, and
provides a comprehensive family of explanation methods.
The explanation methods can be categorized into “model-agnostic” and
“model-specific”:
* “Model-agnostic” means a method can explain the decisions made by a black-box
model without knowing the model details.
* “Model-specific” means a method requires some knowledge about the model to
generate explanations — for instance, whether the model is a tree-based model
or a neural network.
Typically, there are two types of explanations — local and global:
* A local explanation indicates why a particular decision was made for a specific instance.
* A global explanation describes the overall behavior of the model rather than a particular decision.
The following table shows the supported interpretability algorithms in OmniXAI:
| Method | Model Type | Explanation Type | EDA | Tabular | Image | Text | Timeseries |
|:---|:---|:---|:---:|:---:|:---:|:---:|:---:|
| Feature analysis | NA | Global | ✅ | | | | |
| Feature selection | NA | Global | ✅ | | | | |
| Partial dependence plot | Black box | Global | | ✅ | | | |
| Sensitivity analysis | Black box | Global | | ✅ | | | |
| LIME [1] | Black box | Local | | ✅ | ✅ | ✅ | |
| SHAP [2] | Black box | Local | | ✅ | ✅ | ✅ | ✅ |
| Integrated Gradients (IG) [3] | PyTorch or TensorFlow | Local | | ✅ | ✅ | ✅ | |
| Counterfactual | Black box | Local | | ✅ | ✅ | ✅ | ✅ |
| Contrastive explanation | PyTorch or TensorFlow | Local | | | ✅ | | |
| Grad-CAM [4] | PyTorch or TensorFlow | Local | | | ✅ | | |
| Learning-to-explain (L2X) | Black box | Local | | ✅ | ✅ | ✅ | |
| Linear models | Linear models | Global/local | | ✅ | | | |
| Tree models | Tree models | Global/local | | ✅ | | | |
Some other key features of OmniXAI:
* Supports feature analysis and selection — for example, analyzing feature
correlations, checking data imbalance issues, selecting features via mutual
information
* Supports the most popular machine learning frameworks and model types, such as PyTorch, TensorFlow, scikit-learn, and customized black-box models
* Includes most popular explanation methods, such as
feature-attribution/importance explanation (LIME [1], SHAP [2], Integrated
Gradients (IG) [3], Grad-CAM [4], L2X), counterfactual explanation (MACE
[5]), partial dependence plots (PDP), and model-specific methods (linear and
tree models)
* Can be applied to tabular, vision, NLP, and time-series tasks.
The table below summarizes these features and shows how OmniXAI compares to other existing explanation libraries: Microsoft’s InterpretML, IBM’s AIX360, Eli5, Captum, Alibi, and explainX. OmniXAI is the only one of these libraries to include all of the methods listed; for each method, the table shows how many of the other six libraries also support it (see our research paper for the full per-library comparison).
| Data Type | Method | OmniXAI | Other libraries supporting it |
|:---|:---|:---:|:---:|
| Tabular | LIME | ✅ | 3 |
| Tabular | SHAP | ✅ | 5 |
| Tabular | Partial dependence plot | ✅ | 1 |
| Tabular | Sensitivity analysis | ✅ | 1 |
| Tabular | Integrated Gradients | ✅ | 2 |
| Tabular | Counterfactual | ✅ | 1 |
| Tabular | L2X | ✅ | 0 |
| Tabular | Linear models | ✅ | 5 |
| Tabular | Tree models | ✅ | 5 |
| Image | LIME | ✅ | 1 |
| Image | SHAP | ✅ | 1 |
| Image | Integrated Gradients | ✅ | 2 |
| Image | Grad-CAM | ✅ | 2 |
| Image | Contrastive explanation | ✅ | 2 |
| Image | Counterfactual | ✅ | 1 |
| Image | L2X | ✅ | 0 |
| Text | LIME | ✅ | 2 |
| Text | SHAP | ✅ | 1 |
| Text | Integrated Gradients | ✅ | 2 |
| Text | Counterfactual | ✅ | 0 |
| Text | L2X | ✅ | 0 |
| Timeseries | SHAP | ✅ | 0 |
| Timeseries | Counterfactual | ✅ | 0 |
Other notable features of OmniXAI:
* In practice, users may choose multiple explanation methods to analyze
different aspects of AI models. OmniXAI provides a unified interface so they
only need to write a few lines of code to generate various kinds of
explanations.
* OmniXAI supports Jupyter Notebook environments.
* Developers can add new explanation algorithms easily by implementing a single class inherited from the explainer base class (see the sketch following this list).
* OmniXAI also provides a visualization tool for users to examine the generated
explanations and compare interpretability algorithms.
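To illustrate the extension point mentioned above (adding a new explanation algorithm by subclassing the explainer base class), here is a minimal sketch of a custom explainer. It assumes the ExplainerBase class and the alias/explanation_type registration attributes described in the OmniXAI developer documentation; exact attribute names and the expected return type may differ between versions, so treat this as a template rather than a drop-in implementation.

```python
import numpy as np
from omnixai.explainers.base import ExplainerBase

class RandomFeatureImportance(ExplainerBase):
    """A toy explainer that assigns random importance scores to input features.
    It exists only to illustrate the structure of a custom explainer."""

    explanation_type = "local"  # this explainer produces per-instance explanations
    alias = ["random"]          # the name used to request it, e.g., explainers=["random"]

    def __init__(self, training_data, predict_function, **kwargs):
        super().__init__()
        self.training_data = training_data        # reference data, if the method needs it
        self.predict_function = predict_function  # callable mapping model inputs to outputs

    def explain(self, X, **kwargs):
        # X is assumed here to be a NumPy array of shape (num_instances, num_features).
        # A real explainer would wrap its results in one of OmniXAI's explanation
        # classes so that the dashboard knows how to plot them; a plain dict is
        # returned here only as a placeholder.
        return {"instances": X, "scores": np.random.rand(*X.shape)}
```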
OmniXAI’s Design
The key design principle of OmniXAI is that users can apply multiple explanation methods and visualize the resulting explanations at the same time.
This principle makes our library:
* easy to use (generating explanations by writing just a few lines of code)
* easy to extend (new explanation methods can be added without affecting the library framework), and
* easy to compare (visualizing the explanation results to compare multiple
explanation methods).
The following figure shows the pipeline for generating explanations:
Given an AI model (PyTorch, TensorFlow, scikit-learn, etc.) and data (tabular,
image, text, etc.) to explain, one only needs to specify the names of the
explainers they want to apply, the preprocessing function (e.g., converting raw
input features into model inputs), and the postprocessing function (e.g.,
converting model outputs into class probabilities for classification tasks).
OmniXAI then creates and initializes the specified explainers and generates the
corresponding explanations. To visualize and compare these explanations, one can
launch a dashboard app built on Plotly Dash.
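As a minimal sketch of this pipeline (based on the OmniXAI tutorials; argument names may vary slightly between versions), here is how the tabular case described in Ex.1 below would be set up. The variables tabular_data, model, preprocess, and test_instances are assumed to be defined as in that example.

```python
from omnixai.explainers.tabular import TabularExplainer

# Create the chosen explainers by name; OmniXAI initializes each one internally.
explainer = TabularExplainer(
    explainers=["lime", "shap", "mace", "pdp"],  # which explanation methods to apply
    mode="classification",                       # the task type
    data=tabular_data,                           # training data used to initialize the explainers
    model=model,                                 # e.g., the trained XGBoost classifier
    preprocess=preprocess,                       # converts raw features into model inputs
)

# Local (per-instance) explanations from LIME, SHAP, and MACE,
# and global explanations from PDP.
local_explanations = explainer.explain(X=test_instances)
global_explanations = explainer.explain_global()
```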
Example Use Cases
Ex.1: Explaining Tabular Data for Data Science Tasks
Let’s consider an income prediction task. The dataset used in this example is the Adult dataset [https://archive.ics.uci.edu/ml/datasets/adult], which includes 14 features such as age, workclass, and education. The goal is to predict whether a person’s income exceeds $50K per year based on census data.
We train an XGBoost classifier for this classification task, then apply OmniXAI
to generate explanations for each prediction given test instances. Suppose we
have an XGBoost model with training dataset “tabular_data” and feature
processing function “preprocess”, and we want to analyze feature importance,
generate counterfactual examples, and analyze the dependence between features
and classes.
The explainers applied in this example are LIME, SHAP for feature
attribution/importance explanation, MACE for counterfactual explanation, and PDP
for partial dependence plots. LIME, SHAP, and MACE generate local
(instance-level) explanations, such as showing feature importance scores and
counterfactual examples for each test instance; PDP generates global
explanations, showing the overall dependence between features and classes. Given
the generated explanations, we can launch a dashboard for visualization by
setting the test instances, the generated local and global explanations, the
class names, and additional parameters for visualization.
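Continuing the sketch above, launching the dashboard looks roughly like this (the class names shown for the income task are illustrative):

```python
from omnixai.visualization.dashboard import Dashboard

# Bundle the test instances and the generated explanations into a dashboard
# built on Plotly Dash, then serve it locally.
dashboard = Dashboard(
    instances=test_instances,                 # the test instances being explained
    local_explanations=local_explanations,    # LIME, SHAP, and MACE results
    global_explanations=global_explanations,  # PDP results
    class_names=["<= 50k", "> 50k"],          # label names for the income classes (illustrative)
)
dashboard.show()
```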
The following figure shows the explanation results generated by multiple
explainers (LIME, SHAP, MACE, and PDP) given a test instance:
One can easily compare the feature-attribution explanations generated by LIME and SHAP: for this test instance, the predicted class is “> 50k” instead of “< 50k”.
PDP explains the overall model behavior — for instance, the predicted likelihood of a higher income increases as “Age” increases from 20 to 45 and then decreases as “Age” increases from 45 to 80; longer working hours per week may lead to a higher income; and married people have higher incomes (note: this is a potential bias in the dataset).
Overall, by using this dashboard, one can easily understand why such predictions
are made, whether there are potential issues in the dataset, and whether more
feature processing/selection is needed.
Ex.2: Explaining Image Data for Visual Recognition Tasks
Our next example is an image classification task. We choose a ResNet pre-trained
on ImageNet for demonstration. For generating explanations, we apply the LIME,
Integrated Gradients, and Grad-CAM explainers.
Note that we apply Grad-CAM with different parameters — for example, the target layer for “gradcam0” and “gradcam2” is “layer4[-1]”, while the target layer for “gradcam1” and “gradcam3” is “layer4[-3]”; the label to explain for the first two Grad-CAM explainers is “dog”, while for the other two it’s “cat”.
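A condensed sketch of this setup, based on the OmniXAI vision tutorial, is shown below. For brevity it configures a single Grad-CAM explainer (the four variants above differ only in their target layer and target label) and uses ResNet-50 as a concrete choice of ResNet; the preprocessing/postprocessing functions, the image path, and the parameter names are assumptions that may differ between versions.

```python
import torch
from torchvision import models, transforms
from PIL import Image as PILImage
from omnixai.data.image import Image
from omnixai.explainers.vision import VisionExplainer

model = models.resnet50(pretrained=True).eval()  # a ResNet pre-trained on ImageNet

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
# Convert an OmniXAI Image batch into the tensor batch the model expects.
preprocess = lambda images: torch.stack([transform(img.to_pil()) for img in images])
# Convert model logits into class probabilities.
postprocess = lambda logits: torch.nn.functional.softmax(logits, dim=1)

explainer = VisionExplainer(
    explainers=["lime", "ig", "gradcam"],
    mode="classification",
    model=model,
    preprocess=preprocess,
    postprocess=postprocess,
    params={"gradcam": {"target_layer": model.layer4[-1]}},  # one of the four configurations
)

img = Image(PILImage.open("dog_cat.png").convert("RGB"))  # placeholder path to the test image
local_explanations = explainer.explain(img)
```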
The following figure shows the explanation results generated by these
explainers. As you can see, Grad-CAM generates four local explanations
(GRADCAM#0-#3, one for each of the different parameter scenarios), each with its
own visualization.
Ex.3: Explaining Text Data for NLP Tasks
OmniXAI also supports NLP tasks. Let’s consider a sentiment classification task
on the IMDb dataset where the goal is to predict whether a user review is
positive or negative. We train a text CNN model for this classification task
using PyTorch, then apply OmniXAI to generate explanations for each prediction
given test instances. Suppose the preprocessing function that converts raw text into the model’s inputs is “preprocess”, and we want to analyze word/token importance and generate counterfactual examples.
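A rough sketch of this setup (based on the OmniXAI NLP tutorial) is shown below. Here text_model is a placeholder for the trained text CNN, and postprocess converts model outputs into class probabilities; note that Integrated Gradients usually needs extra model-specific parameters (such as the embedding layer to attribute through), which are omitted from this sketch.

```python
from omnixai.data.text import Text
from omnixai.explainers.nlp import NLPExplainer

explainer = NLPExplainer(
    explainers=["lime", "ig", "polyjuice"],  # word/token importance + counterfactuals
    mode="classification",
    model=text_model,         # the PyTorch text CNN trained on IMDb (placeholder name)
    preprocess=preprocess,    # converts raw text into the model's input tensors
    postprocess=postprocess,  # converts model outputs into class probabilities
)

# Explain a single test review; Text wraps a batch of raw strings.
x = Text(["What a great movie! if you have no taste."])
local_explanations = explainer.explain(x)
```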
The figure below shows the explanation results generated by the LIME, Integrated Gradients, and Polyjuice explainers. LIME and Integrated Gradients
show that the word “great” has the largest word/token importance score, which
implies that the sentence “What a great movie! if you have no taste.” is
classified as “positive” because it contains the word “great”. The
counterfactual method generates several counterfactual examples for this test
sentence — such as, “what a terrible movie! if you have no taste.” — helping us
understand more about the model’s behavior.
Ex.4: Explaining Time Series Anomaly Detection
OmniXAI can be applied to time series anomaly detection and forecasting tasks.
It currently supports SHAP-value-based explanation for anomaly detection and
forecasting, and counterfactual explanation for anomaly detection only. We will
implement more algorithms for time-series-related tasks in the future.
This example considers a univariate time series anomaly detection task. We use a simple statistics-based detector for demonstration: a window of the time series is flagged as an anomaly according to a threshold estimated from the training data.
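A minimal sketch of this setup is shown below. It is loosely based on the OmniXAI time-series tutorial; the Timeseries.from_pd constructor, the .values accessor, and the exact detector interface expected by TimeseriesExplainer should all be treated as assumptions that may differ between versions, and train_df/test_df are placeholder pandas DataFrames indexed by timestamp.

```python
import numpy as np
from omnixai.data.timeseries import Timeseries
from omnixai.explainers.timeseries import TimeseriesExplainer

# train_df / test_df: placeholder pandas DataFrames with one metric column,
# indexed by timestamp.
train_ts = Timeseries.from_pd(train_df)
test_ts = Timeseries.from_pd(test_df)

# A toy statistics-based detector: flag a window as anomalous if any value
# exceeds a threshold estimated from the training data.
threshold = train_df.values.mean() + 3 * train_df.values.std()

def detector(ts: Timeseries) -> float:
    values = ts.values                        # assumed accessor for the underlying array
    return float(np.any(values > threshold))  # 1.0 = anomaly, 0.0 = normal

explainer = TimeseriesExplainer(
    explainers=["shap", "mace"],  # SHAP values and counterfactual explanation
    mode="anomaly_detection",
    data=train_ts,                # training data as a Timeseries object
    model=detector,               # the black-box anomaly detector
    preprocess=None,
)
local_explanations = explainer.explain(test_ts)  # explain the test window
```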
The following figure shows the explanation results generated by SHAP and MACE:
The dashed lines show the importance scores and the counterfactual example, respectively. SHAP highlights the timestamps that contribute most to this test window being detected as an anomaly. MACE provides a counterfactual example showing that the window would not be detected as an anomaly if the metric values from 20:00 to 00:00 were around 2.0. From these two explanations, one can clearly understand why the model considers it an anomaly.
The Big Picture: Societal Benefits and Impact
Using an XAI library such as OmniXAI in the real world should have a positive
effect on AI and society, for multiple reasons:
* The decisions made by AI models in real-world applications such as healthcare
and finance potentially have a huge societal impact, yet the lack of
explainability in those models hampers trust in AI systems and inhibits their
adoption. OmniXAI provides valuable tools to “open” AI models and explain
their decisions to improve the transparency and persuasiveness of AI systems,
helping customers understand the logic behind the decisions. This not only
gives customers peace of mind and increased knowledge, but also fosters general trust in AI systems, which helps speed their adoption and bring their benefits to more areas of society.
* Real-world applications usually need to handle complicated scenarios, which
means AI models may sometimes make unsatisfying or wrong decisions due to
data complexity. When that happens, explaining why an AI model fails is
vitally important. Such explanations not only help maintain customer
confidence in AI systems, but also help developers resolve issues and improve
model performance. OmniXAI analyzes various aspects of AI models and figures
out the relevant issues when a wrong decision is made, helping developers
quickly understand the reasons behind failures and improve the models
accordingly.
* While OmniXAI has many potential positive impacts, the misuse of OmniXAI may
result in negative impacts. In particular, the OmniXAI library should not be
used to explain misused AI/ML models or any unethical use case. It also
should not be used to enable hacking of models through public prediction
APIs, stealing black-box models, or breaking privacy policies.
The Bottom Line
* OmniXAI is designed to address many of the pain points in explaining
decisions made by AI models.
* OmniXAI aims to provide data scientists, machine learning engineers, and
researchers with a comprehensive one-stop solution to analyze, debug, and
explain their AI models.
* We will continue to actively develop and improve OmniXAI. Planned future work includes more algorithms for feature analysis, more interpretable algorithms, and support for more data types and tasks (for example, visual-language tasks and recommender systems). We will also improve the dashboard to make it more comprehensive and easier to use.
* We welcome and encourage any contributions from the open source community.
Explore More
Salesforce AI Research invites you to dive deeper into the concepts discussed in
this blog post (links below). Connect with us on social media and our website to
get regular updates on this and other research projects.
* Paper: Learn more about OmniXAI by reading our research paper.
[https://arxiv.org/abs/2206.01612]
* Code: See our OmniXAI library [https://github.com/salesforce/OmniXAI] Github
page.
* Demo: Check out our Explanation Dashboard Demo
[https://sfr-omnixai-demo.herokuapp.com/].
* Tutorials: Check out more detailed examples and tutorials of OmniXAI
[https://github.com/salesforce/OmniXAI/tree/main/tutorials].
* Documentation: Explore OmniXAI documentation
[https://opensource.salesforce.com/OmniXAI/latest/index.html].
* Contact us: omnixai@salesforce.com.
* Follow us on Twitter: @SFResearch [https://twitter.com/SFResearch],
@Salesforce [https://twitter.com/Salesforce].
Related Resources
[1] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why should I trust
you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD
’16, page 1135–1144, 2016.
[2] Scott M Lundberg and Su-In Lee. A unified approach to interpreting model
predictions. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S.
Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing
Systems 30, pages 4765–4774. Curran Associates, Inc., 2017.
[3] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for
deep networks. In Proceedings of the 34th International Conference on Machine
Learning – Volume 70, ICML’17, page 3319–3328. JMLR.org, 2017.
[4] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna
Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep
networks via gradient-based localization. International Journal of Computer
Vision, 128(2):336–359, Oct 2019.
[5] Wenzhuo Yang, Jia Li, Caiming Xiong, and Steven C.H. Hoi. MACE: An Efficient
Model-Agnostic Framework for Counterfactual Explanation, 2022. URL
https://arxiv.org/abs/2205.15540.
About the Authors
Wenzhuo Yang is a Senior Applied Researcher at Salesforce Research Asia, working
on AIOps research and applied machine learning research, including causal
machine learning, explainable AI, and recommender systems.
Steven C.H. Hoi is Managing Director of Salesforce Research Asia and oversees
Salesforce’s AI research and development activities in APAC. His research
interests include machine learning and a broad range of AI applications.
Donald Rose is a Technical Writer at Salesforce AI Research. Specializing in
content creation and editing, Dr. Rose works on multiple projects, including
blog posts, video scripts, news articles, media/PR material, social media,
writing workshops, and more. He also helps researchers transform their work into
publications geared towards a wider audience.