Over the last couple of years, the interpretability and explainability of AI systems have been gaining momentum, driven by the need to ensure the transparency and accountability of AI operations while minimizing the consequences of poor decisions. Explainable AI (XAI) research aims to provide a set of techniques that produce more explainable models while maintaining a high level of searching, learning, planning, and reasoning performance. XAI helps human users understand, trust, and effectively manage AI systems. Research works have introduced different measures and frameworks for XAI. Most of these frameworks focus on defining model explainability, formulating explainability tasks for understanding model behavior, developing solutions for these tasks, and specifying measures and techniques for evaluating the performance of models in explainability tasks. The spectrum of proposed methods is very wide, as it includes methods for all the different flavors of AI systems (e.g., Machine Learning, Deep Learning, Robotics, Multi-Agent Systems) and for a wide array of vertical applications in sectors such as healthcare, finance, human resources, and industry.
Some of the most popular families of XAI techniques are:
- Perceptive Interpretability: Techniques whose interpretations are rather obvious and can therefore be directly perceived by humans (e.g., algorithms that classify objects into given segments/categories).
- Saliency Techniques: These are specific cases of perceptive interpretability methods. They explain the decision of an algorithm by assigning values (e.g., probabilities or heatmaps) that reflect how much each input component contributed to that decision. Popular saliency techniques, such as DeepLIFT and Prediction Difference Analysis, decompose a model's decision into the contributions of discriminative features used for classification.
- Signal Methods: Signal methods for deep learning systems provide interpretability by observing the stimulation of individual neurons or collections of neurons. The activation values of neurons can be manipulated or transformed into an interpretable form.
- Verbal Interpretability: These techniques provide interpretability in the form of verbal chunks or rules that humans can naturally understand (e.g., sentences that indicate causality).
- Feature Extraction and Feature Engineering Techniques: Feature Extraction techniques provide insights into the explainability and interpretability of AI models by identifying the features that are the strongest predictors of the AI outcomes. In a follow-up blog post we will illustrate how one can implement techniques like DeepLIFT in practice.
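To make the saliency idea concrete, here is a minimal occlusion-style sketch in the spirit of Prediction Difference Analysis: each input feature is replaced with a baseline value, and the resulting change in the model's output is taken as that feature's importance. The linear "model" and its weights below are made up purely for illustration.

```python
import numpy as np

# Toy stand-in for a trained classifier: a fixed linear scorer.
# These weights are illustrative, not learned from real data.
weights = np.array([0.8, -0.1, 0.05, 0.6])

def model(x):
    return float(weights @ x)

def occlusion_saliency(x, baseline=0.0):
    """Score each feature by how much the model's output changes when
    that feature is replaced with a baseline value (an occlusion-style
    simplification of Prediction Difference Analysis)."""
    base_score = model(x)
    saliency = np.empty_like(x, dtype=float)
    for i in range(len(x)):
        perturbed = x.copy()
        perturbed[i] = baseline
        saliency[i] = base_score - model(perturbed)
    return saliency

x = np.array([1.0, 1.0, 1.0, 1.0])
saliency = occlusion_saliency(x)
print(saliency)  # larger magnitude = more influential feature
```

For an image classifier, the same loop over patches of pixels instead of single features yields the familiar saliency heatmaps.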
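Signal methods can likewise be sketched with a tiny network: the forward pass is instrumented so that the hidden-layer activations, the "signal" these methods inspect, are returned alongside the output. The two-layer ReLU network and its random weights below are toy assumptions, not a real trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network with random weights, standing in for a trained model.
W1 = rng.normal(size=(3, 4))
W2 = rng.normal(size=(4, 1))

def relu(z):
    return np.maximum(z, 0.0)

def forward_with_activations(x):
    """Run a forward pass and also expose the hidden activations,
    the per-neuron 'signal' that signal methods observe."""
    hidden = relu(x @ W1)
    output = hidden @ W2
    return output, hidden

x = np.array([1.0, -0.5, 2.0])
output, hidden = forward_with_activations(x)
print("active neurons:", np.flatnonzero(hidden > 0))
```

In a real deep learning framework the same effect is achieved with layer hooks; once the activations are captured, they can be visualized or transformed into an interpretable form as described above.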
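A minimal sketch of verbal interpretability is a decision procedure that emits its justification as a natural-language rule. The loan scenario, feature names, and thresholds below are entirely hypothetical, chosen only to show the shape of the output.

```python
# Toy rule-based explainer; the thresholds and feature names are
# invented for illustration, not taken from any real model.
def explain_loan_decision(income, debt_ratio):
    """Return a decision plus a verbal, human-readable rule justifying it."""
    if debt_ratio > 0.5:
        return ("declined", "Declined because the debt ratio exceeds 0.5, "
                            "which indicates high repayment risk.")
    if income >= 40_000:
        return ("approved", "Approved because the debt ratio is at most 0.5 "
                            "and income is at least 40,000.")
    return ("declined", "Declined because income is below 40,000.")

decision, reason = explain_loan_decision(income=55_000, debt_ratio=0.3)
print(decision, "-", reason)
```

Rule-extraction methods produce explanations of exactly this form from opaque models, e.g., by distilling a neural network into a shallow decision tree whose paths read as if-then sentences.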
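Finally, one common way to identify the strongest predictors is permutation importance (a standard feature-importance technique, used here as an illustration rather than a method named in this post): shuffle one feature at a time and measure how much the model's error grows. The synthetic dataset and least-squares "model" below are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: only the first of three features truly drives the target.
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)

# Fit a linear model via least squares as the "AI model" under study.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def mse(X_, y_):
    return float(np.mean((X_ @ coef - y_) ** 2))

baseline = mse(X, y)

# Permutation importance: shuffle one feature at a time; the bigger the
# increase in error, the stronger that feature is as a predictor.
importances = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importances.append(mse(X_perm, y) - baseline)

print(importances)  # the first feature should dominate
```

Because the score is model-agnostic, the same loop works unchanged for any predictor, which is what makes it a convenient first tool for feature-level explainability.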