
A Roadmap of Explainable Artificial Intelligence: Explain to Whom, When, What and How?

Published: 24 November 2024

Abstract

Explainable artificial intelligence (XAI) has gained significant attention, especially in AI-powered autonomous and adaptive systems (AASs). However, a discernible disconnect exists among research efforts across different communities. The machine learning community often overlooks “explaining to whom,” while the human-computer interaction community has examined various stakeholders with diverse explanation needs without addressing which XAI methods meet these requirements. Currently, no clear guidance exists on which XAI methods suit which specific stakeholders and their distinct needs. This hinders the achievement of the goal of XAI: providing human users with understandable interpretations. To bridge this gap, this article presents a comprehensive XAI roadmap. Based on an extensive literature review, the roadmap summarizes different stakeholders, their explanation needs at different stages of the AI system lifecycle, the questions they may pose, and existing XAI methods. Then, by utilizing stakeholders’ inquiries as a conduit, the roadmap connects their needs to prevailing XAI methods, providing a guideline to assist researchers and practitioners to determine more easily which XAI methodologies can meet the specific needs of stakeholders in AASs. Finally, the roadmap discusses the limitations of existing XAI methods and outlines directions for future research.

1 Introduction

With the rapid development of AI and its growing role in autonomous and adaptive systems (AASs), its impact and application continue to expand, encompassing areas such as face recognition [56], medical diagnosis [37], product recommendation [71], and so on. However, the widespread use of AI, particularly in AASs, raises many ethical challenges, especially the risks associated with opaque and unexplainable AI models [92]. Many prevalent machine learning (ML) models are regarded as “black box” models, meaning that the decisions made by AI systems lack explainability. This lack of explainability poses significant trust issues for users, especially in critical areas such as healthcare, finance, autonomous driving, and law, where AASs are increasingly adopted. Therefore, explainable AI (XAI) has received a lot of attention in recent years [10, 80], with the goal of providing human users with appropriate explanations that they can understand [10]. Although this article talks about AI systems, almost all of the discussion applies equally to AASs, because almost all AASs use AI techniques to some extent. Thus, explainability is also essential for decisions made by autonomous systems, especially when those decisions affect human beings or the natural environment.
To achieve this, the ML community has proposed many XAI methods to interpret an ML model's decisions or behavior [10, 80]. However, current research in XAI has encountered criticism, captured by the metaphor of “inmates running the asylum” [142], which highlights a significant gap between existing XAI methods and the needs of real-world users [120]. This gap arises because the development of XAI methods is largely based on researchers’ assumptions rather than a thorough understanding of stakeholders’ specific needs [58]. In other words, ML researchers frequently create new XAI methods (i.e., how to explain) without considering the intended audience, simply assuming that the recipients of explanations are either “users” or “AI experts” [22, 72, 205]. Consequently, there have been increasing calls in the ML community to follow a human-centered approach to XAI research [120, 187], one that takes into account the stakeholders to whom explanations are provided, especially in AASs.
In contrast, within the human-computer interaction (HCI) community, a number of studies have independently considered various stakeholders associated with XAI methods (i.e., explain to whom) [47, 67, 72, 94, 115, 121, 123, 138, 140, 145, 171, 179, 182, 210], the stages in the AI system lifecycle at which explanations are provided (i.e., when to explain) [47, 179, 205], and the questions or inquiries that stakeholders could pose (i.e., what to explain) [120, 123, 145]. Despite these efforts to explore XAI in different dimensions, critiques have emerged regarding the breadth and depth of such research. Suresh et al. [205] pointed out that existing research is too coarse in its delineation of stakeholders, while Bingley et al. [24] emphasized the need to focus further on stakeholders’ needs for explanations. Further, Bhatt et al. [22] found that the vast majority of work only identifies which different stakeholders there are, ignoring their specific needs. In addition, existing research is incomplete in its consideration of the stages of explanation and the questions that stakeholders may ask. Overall, existing studies are unclear about which XAI methods stakeholders should choose to meet their needs for explanation, leaving a gap in achieving the goal of XAI, i.e., providing human users with appropriate explanations that are easy for them to understand. This issue is particularly pressing in AASs, where appropriate and accurate explanations can directly impact trust and system reliability. Therefore, this article improves existing reviews of XAI by providing a more refined and comprehensive roadmap that links the different pillars of XAI together, especially within the context of AASs. In particular, we consider nine different stakeholders and their respective needs for explanations, four stages of explanation that span the entire AI system lifecycle, and nine questions that stakeholders might ask.
To the best of our knowledge, while prior studies have individually delved into issues of explaining to whom, when, what, or how, there exists a conspicuous gap in the research that consolidates all four dimensions together in a single framework, with only limited efforts made to explore the interactions between pairs of these dimensions [47, 120, 205]. As observed by Vermeire et al. [218], there is a lack of thorough and sufficient cross-referencing of relevant studies on XAI from the ML perspective (relevant studies from the ML community) and from the user perspective (relevant studies from the HCI community). This leads to a lack of clarity among stakeholders regarding the availability of suitable XAI methods that can effectively address their needs, affecting the practical utilization of XAI methods [218] and thus hindering the achievement of the goal of XAI, which is to provide human users with explanations that they can understand [10]. Therefore, this article emphasizes the creation of a roadmap that connects stakeholders’ needs with XAI methods, utilizing “what to explain” questions as a bridge to holistically examine the four facets of explaining to whom, when, what, and how. This roadmap provides a guideline to help various stakeholders in AASs select the appropriate XAI method to meet their needs for explanation, so as to achieve the goal of XAI [10].
The main contributions of this article are summarized as follows:
(1)
We introduce a comprehensive ML ecosystem with a fine-grained division of nine different stakeholders. Through an extensive literature review, we have clarified each stakeholder's needs for explanations at different stages of the AI system lifecycle, forming a crucial part of the roadmap.
(2)
A new type of “what to explain” question is proposed to complement the existing work in the literature, further enriching the roadmap.
(3)
By using “what to explain” questions as a bridge to integrate the four dimensions of to whom, when, what, and how to explain, we have established a connection between stakeholders’ needs and the existing XAI methods. This connection forms the core of the roadmap, providing a guideline to help stakeholders choose the appropriate XAI method to meet their needs for explanation. It also enables us to see more clearly and precisely what has been done or is missing in the existing literature.
(4)
Two case studies are provided to illustrate how the roadmap and its guidance can be used to assist stakeholders in AASs in selecting appropriate XAI methods to meet their explanation needs.
(5)
Through our structured and systematic roadmap, major gaps in the current literature are identified from the point of view of the AI system lifecycle, enabling readers to understand in more detail the potential challenges of meeting the goals of XAI in real-world applications of AI and AASs.
The rest of this article is organized as follows. Section 2 describes the integrated structure of our roadmap. Section 3 presents the ML ecosystem proposed in this article, considering various stakeholders and their needs at different stages of the AI system lifecycle. Section 4 introduces the nine “what to explain” questions considered in this article. Section 5 summarizes the 10 categories of XAI methods considered in this article. Section 6 uses the “what to explain” questions as a bridge to connect stakeholders’ needs with XAI methods, thus providing a guideline within our roadmap to help stakeholders select the appropriate XAI method to meet their needs for explanation. Section 7 presents two case studies to demonstrate the practical application of this roadmap and guideline. Section 8 discusses research gaps identified through our study and presents some future research directions. Section 9 concludes the article.

2 Four-Pillar Structure of Our Roadmap

In the realm of XAI, achieving its goals involves four key questions [47, 115]: (1) Who are the intended audiences or stakeholders for the explanation, and what are their specific needs [72, 171, 210]? (2) When in the AI system lifecycle should explanations be provided to the stakeholders [47, 179, 205]? (3) What inquiries will stakeholders pose in their quest for explanations, i.e., what do they want [120, 123, 145]? (4) How should these questions be effectively answered [10, 74, 80, 125, 175, 257]? While each of these questions has been studied independently, there has been no comprehensive overview that addresses all four questions together and analyzes their interconnections, particularly in the context of AASs.
To bridge this gap, we have taken a holistic approach by summarizing these questions as the four central pillars of XAI, which are critical to applying XAI in a way that meets stakeholders’ needs. These pillars provide the foundation for a complete view of XAI, as illustrated in Figure 1. For each of these pillars, researchers have conducted relevant studies, which are described below individually.
Fig. 1. Four central pillars supporting a holistic view of XAI.

2.1 Explain to Whom

Within the HCI community, there are many research efforts that consider different stakeholders [47, 67, 72, 94, 115, 121, 123, 138, 140, 145, 171, 179, 182, 210] in the context of AI and AASs, including AI experts, decision makers, regulators, users, and so on. However, such work has attracted some criticism. Suresh et al. [205] pointed out that many works categorize stakeholders at a very coarse granularity, and Bhatt et al. [22] found that the vast majority of the work only considers which different stakeholders there are, without clarifying each stakeholder's specific needs for explanation. In response to these critiques, and based on an extensive literature review, this article considers nine types of stakeholders at a fine-grained level and specifies each stakeholder's needs for explanation. For example, general end-users of autonomous vehicles may need an explanation of how the system makes navigation decisions in order to build appropriate trust.

2.2 When to Explain

A small number of XAI-related studies have considered when to provide explanations. Rosenfeld et al. [179] were the first to divide explanations, based on when they are presented, into explanations given before the task, continuing explanations provided throughout the task, and explanations given after the task. Suresh et al. [205] considered four stages (development, deployment, immediate usage, and downstream impact) and linked them to goals and objectives distilled from stakeholders’ needs. Dhanorkar et al. [47] divided the AI system lifecycle into three stages (model development, model validation during proof of concept, and model in-production) and related them to AI experts and subject matter experts and their motivations for explainability. However, the AI system lifecycles considered in the literature [205] and [47] are incomplete, as they commence from the modeling stage and disregard the preceding stages. To address this gap, this article adopts the more comprehensive AI system lifecycle model proposed in the literature [6, 63, 237].

2.3 What to Explain

Researchers have considered various questions that stakeholders might pose to a black-box ML model [120, 123, 145]. Lim and Dey [123] considered five types of questions that end-users of the system might ask. Mohseni et al. [145] reviewed six “what to explain” questions commonly employed in the design of XAI systems. Inspired by the work of Lim and Dey [123], Liao et al. [120] represented users’ needs for explainability in terms of the questions a user might ask; through interviews with 20 designers and user experience practitioners involved in various AI products, they developed an algorithm-based library of XAI questions. However, the stakeholder questions considered in existing research mainly focus on seeking explanations for the decision-making outcomes of the ML model, while neglecting insights within the ML model itself, such as what representations or knowledge a neural network has learned. This article introduces a new “what to explain” question, focusing on insights into the ML model's internals, as a complement to the existing work, considering a total of nine different “what to explain” questions in a more comprehensive way. For example, in an adaptive recommender system, general end-users may want to know why the system made a particular recommendation.

2.4 How to Explain

In order to meet stakeholders’ needs for explanations, researchers in the ML community have proposed various XAI approaches, including feature attribution explanation (FAE) [130], partial dependence plot (PDP) [68], counterfactual (CF) [223], and others. In this article, we consider a total of ten classes of XAI methods in terms of how to explain, based on several widely recognized survey papers [10, 74, 80, 125, 175, 257].
While there are papers addressing to whom, when, what, or how to explain separately, as mentioned previously, no existing work has ever considered all four pillars in a single framework and discussed interplays among the four pillars. A few studies considered two pillars only. Suresh et al. [205] and Dhanorkar et al. [47] considered stakeholders’ needs at different stages of explanation (when), while Liao et al. [120] related the questions that stakeholders would ask (what) to several XAI methods (how). However, the lack of a comprehensive overview encompassing all four pillars has hindered practical applications of XAI methods, as it is unclear what stakeholders’ needs for explanation are, what questions they might ask, as well as which XAI methods stakeholders should use and when to use them to meet their specific needs for explanation.
To fill this gap, this article constructs a roadmap that considers all four pillars of XAI simultaneously in order to present a holistic picture of “when to explain, what to explain, to whom, to meet what need, and how.” Our roadmap involves identifying the stakeholders and their needs for explanations throughout different stages of the AI system lifecycle, determining the questions they would pose to address their needs, and assessing the existing XAI methods that can offer corresponding answers. By establishing these connections among the four pillars, we can gain a comprehensive understanding of the inter-relationships among the four pillars and provide a guideline to assist stakeholders in selecting the appropriate XAI method to meet their needs for explanation.

3 When to Explain, to Whom, and to Meet What Need

In this section, we first introduce the four stages of the AI system lifecycle [6, 63, 237] that are taken into account in this article. Subsequently, we present the nine distinct classes of stakeholders considered and outline their respective needs for explanation at different stages of the AI system lifecycle, thereby establishing the foundation of our roadmap.

3.1 AI System Lifecycle under Consideration

Based on the proposed AI system lifecycle models [6, 63, 237], this article considers four major stages relevant to AI and AASs:
Model Requirement Analysis: At this initial stage, the meticulous definition of project goals and identification of desired functionalities are paramount [83]. For example, in an autonomous vehicle project, stakeholders might specify safety as a primary goal, influencing the selection of algorithms and models. In addition, selecting an appropriate formulation for the specific problem is important [83].
Data Acquisition and Understanding: This stage entails the discovery and integration of relevant datasets. It involves essential tasks such as data cleaning, elimination of inaccuracies or noise, annotation of ground truth labels, comprehensive exploration, and comprehension of the dataset, as well as rigorous quality analysis and evaluation [6, 63]. For example, in an autonomous vehicle application, understanding and processing data from various sensors are vital for accurate navigation.
Modeling AI: This stage includes feature engineering, ML model training, ML model evaluation, ML model validation, and so on [63, 83]. The ultimate goal is to build a high-performing ML model that can adapt to changing environments, such as those found in autonomous driving systems.
Deployment, Monitoring, and Interaction: This stage requires tracking the behavior of the ML model to ensure that it performs as expected and to monitor for errors that may occur during the ML model's execution [6, 83]. For example, in an autonomous vehicle, real-time monitoring is crucial to adjust its driving behavior based on road conditions and obstacles. In addition, many stakeholders may interact with the ML model at this stage.

3.2 Identify Stakeholders and Their Needs for Explanation at Different Stages

We consider nine stakeholders who directly or indirectly interact with the ML system, forming the ML ecosystem, as shown in Figure 2. From Figure 2, we can see that all stakeholders, except for the decision makers, the subjects of decision, and the data subjects, engage in direct two-way interactions with the ML system. Decision makers interact bidirectionally with both model operators and subjects of decision, receiving the ML system's decision results from the ML model operators and conveying the final decision outcome to the subjects of decision. Data subjects, on the other hand, only engage with the ML system in a unidirectional manner, where the ML system receives data from the data subjects, but there is no reciprocal interaction. Thus, the arrow between the data subjects and the ML system in Figure 2 represents a unidirectional flow. In the rest of this section, we will delve into each class of stakeholders individually, outlining their specific needs for explanation at different stages of the AI system lifecycle.
Fig. 2. The ML ecosystem proposed in this article. The connecting lines represent the two parties interacting, and the arrow direction represents the direction of the interaction. The icons at the bottom of each stakeholder represent the corresponding stage of the AI system lifecycle at which they will require XAI to provide the appropriate explanation to meet their needs.

3.2.1 AI Theory Experts.

AI theory experts generally refer to the theoretical researchers who drive the development of AI. Their primary focus lies in exploring the properties, internal information, and logic of black-box models, particularly deep neural networks (DNNs), rather than developing specific applications [94, 171, 181].
When to explain: Modeling AI; Deployment, Monitoring and Interaction.
Need: Insight and understanding of the internal logic of complex ML models. The opaqueness of complex black-box models makes their decision logic impossible to understand, which undermines people's trust in them and largely affects their application and development [79, 167]. AI theory experts aim to leverage XAI to reveal what black-box models have learned, to better understand the decision logic of black-box models and their properties, and thus to gain new insights and knowledge about how black-box models learn from data [242].

3.2.2 AI Development Experts.

AI development experts primarily refer to professionals involved in building AI applications. They have different responsibilities, and hence different needs for explanation, at different stages of the AI system lifecycle [108, 137], as described below.
When to explain: Model Requirements Analysis.
Need: Comparative analysis of multiple ML models. In the model requirements analysis stage, AI development experts need to select a suitable ML model from a set of candidate models [65]. Solely depending on performance metrics, such as accuracy or F1 score, often proves inadequate for making an informed choice [204]. Instead, they strive to gain deeper insights into the ML models by comprehending their execution details and underlying logic [155]. This deeper comprehension provides guidance for choosing the most suitable model for a given problem. XAI methods can provide insight into models to help AI development experts compare models and architectures [181, 183].
When to explain: Data Acquisition and Understanding.
Need 1: Insight and understanding of datasets. An ML model is usually built from data, and gaining insight and improving understanding of the dataset is instrumental in building more effective ML models [208]. Therefore, before modeling, AI development experts seek to gain insights from datasets to enhance their comprehension of datasets. Through the application of XAI methods, they can analyze data distribution, data quality, and feature relationships within the data [102].
Need 2: Analyzing potential errors, noise, and bias in the dataset. Gudivada et al. [78] emphasized the criticality of high-quality datasets in the development of ML models. However, datasets often contain noise, bias, and even mislabeled samples [201]. Therefore, AI development experts can improve the quality of datasets by using XAI methods to detect and identify potential errors, noise, biases, and other issues in the datasets [141].
When to explain: Modeling AI.
Need 1: Assisting with feature selection. Feature engineering involves transforming raw data into new representations to address specific data problems [49], thus improving the generalization ability of ML models. It is often regarded as a time-consuming and technically challenging step [154, 225]. AI development experts can analyze and gain deeper insight into the data and models using XAI methods to guide feature selection.
Need 2: Optimizing model architecture and hyper-parameters. As deep learning continues to grow in popularity, identifying optimal combinations of model architecture and hyper-parameters has become increasingly crucial for building high-performing models [225]. AI development experts can leverage XAI methods to guide the tuning of model architectures and hyper-parameter settings, ultimately enhancing model performance.
Need 3: Checking the model's decisions. AI development experts possess a natural inclination to delve deeper into the ML model's decision-making process rather than just analyzing its outputs. They strive to comprehend the overall logic inside the constructed model and how it arrives at specific decisions [47]. Therefore, AI development experts rely on XAI methods to interpret the ML model in order to assess whether its behavior aligns with expectations [115]. This understanding is critical in safety-critical AASs like autonomous vehicles.
Need 4: Guiding model debugging and error refinement. During the modeling stage, AI development experts continuously debug and iterate to enhance the ML model's performance [30]. They aspire to understand the reasons behind the ML model's underperformance in certain challenging scenarios [22], as well as identify potential biases and discrimination within the ML model. XAI methods offer valuable assistance in diagnosing model issues and can be employed to mitigate bias, ensuring social responsibility and fairness [73].
When to explain: Deployment, Monitoring and Interaction.
Need 1: Adjusting the ML model to meet the user's expectations and needs. The performance of AI systems sometimes defies user expectations, and this mismatch of expectations becomes a barrier to HCI [153]. In instances where the ML model's decision result does not align with the user's desired outcome, XAI methods can assist AI development experts in adapting the ML model to better match user expectations [47].
Need 2: Assessing the impact of dataset shift. After deploying a model, there may be a significant shift in the distribution of predicted data compared to the training data [7, 32], which greatly affects the accuracy and reliability of trained ML models [231]. AI development experts utilize XAI methods to analyze and understand the effects of dataset shifts on the model's decision-making capabilities [22].

3.2.3 Model Validation Team.

The model validation team typically consists of enterprise owners/managers [18] and/or a group of senior experts in the relevant field [72]. Their responsibilities involve reviewing and validating the training data and the ML model to assess the compatibility of the training data with their proprietary data and to determine if the ML model aligns with the specific needs of their enterprise. Additionally, they evaluate whether the ML model meets regulatory requirements in terms of privacy and security [47], ultimately making decisions on whether to trust and deploy the ML model in systems such as autonomous vehicles or adaptive healthcare solutions.
When to explain: Data Acquisition and Understanding.
Need: Evaluating data suitability. The model validation team requires comprehensive knowledge of the training data to evaluate its compatibility with their proprietary data [47]. Therefore, they rely on XAI methods to gain insights into the training data as well as compare and analyze them along with their proprietary data.
When to explain: Modeling AI.
Need 1: Reviewing the ML model's decision logic. Through the use of XAI methods, the model validation team seeks to enhance their global and local understanding of the ML model. Their goal is to effectively review and validate whether the ML model's decision logic aligns with the pertinent domain expertise [72, 221].
Need 2: Determining compliance with regulations. A paper by the UK's Financial Conduct Authority stated that boardrooms must learn to tackle some significant issues emerging from AI, particularly in terms of ethics, accountability, transparency, and responsibility [64]. The model validation team, especially business owners/managers, relies on XAI methods to review models’ compliance with relevant legal regulations, such as requirements for ethics, privacy, security, and fairness [47].

3.2.4 Model Operators.

Model operators [210] refer to those who interact directly with the ML model/system, such as loan officers [2, 11], medical professionals [91], and operators of autonomous systems like drones or self-driving cars.
When to explain: Deployment, Monitoring and Interaction.
Need: Ensuring correct and efficient interaction. Model operators interact directly with the ML model/system and can utilize XAI methods to help them better understand the ML model and its decisions, so that they can interact with the ML model correctly and efficiently [210].

3.2.5 Decision Makers.

After obtaining the information passed from the model operators, the decision makers are responsible for making the final decision with the aid of model output and their own experience.
When to explain: Deployment, Monitoring and Interaction.
Need 1: Comprehending specific model decisions. While obtaining the decision results of the ML model, comprehending the rationale behind specific decisions enhances decision makers’ confidence and trust [72]. This understanding allows decision makers to integrate their knowledge with the ML model's information effectively, enabling them to make informed decisions [60, 210].
Need 2: Deepening overall understanding of the ML model and improving decisions. Decision makers leverage XAI methods to grasp the decision boundaries, classification thresholds, and decision logic of black box models. This deeper understanding empowers them to deepen their overall understanding of the ML models and refine and improve their own decision logic [72].

3.2.6 Regulators.

Regulators safeguard the interests of the subjects of decision and general end-users [138]. They review models for compliance with relevant standards or regulations, including reviewing models for interpretability and fairness, as well as overseeing the allocation of responsibility in the event of an accident.
When to explain: Deployment, Monitoring and Interaction.
Need 1: Ancillary model review. In addition to interpretability and transparency, relevant standards or regulations impose requirements on other aspects of AI systems. For example, the EU High Level Expert Group on AI has established guidelines for trustworthy AI, which outline requirements for AI safety, robustness, privacy protection, and fairness [41]. XAI methods can provide explanations to assist in reviewing whether AI systems fulfill the stipulations of relevant regulations [2, 73].
Need 2: Assisting in apportioning responsibility. When errors or accidents occur, regulators are responsible for attributing accountability [192]. However, the inherent complexity and lack of transparency in AI systems can make this task daunting [144]. XAI methods can assist in apportioning responsibility [53] by revealing the causes of errors and mishaps through techniques such as attribution.

3.2.7 Data Subjects.

The General Data Protection Regulation (GDPR) defines a data subject as a natural person “who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” [214]. In this article, data subjects are considered individuals whose personal data are utilized for model training but who do not directly interact with the ML model.
When to explain: Modeling AI; Deployment, Monitoring and Interaction.
Need: Protecting personal data information. Data subjects seek to understand the types of information utilized by the ML model and how such information influences the ML model's decisions, along with the potential implications [205, 210]. This understanding is crucial for protecting their lawful rights concerning personal data, as outlined in the relevant legal frameworks [70].

3.2.8 Subjects of Decision.

Subjects of decision are those who are affected by the decisions made by decision makers [210]. Although they do not interact directly with the ML system, they are directly affected by its output [138], such as loan applicants or patients in healthcare AASs.
When to explain: Deployment, Monitoring and Interaction.
Need 1: Understanding the decision and how to maintain or change it. The subjects of decision seek to comprehend how the ML model makes a particular decision and how they can maintain or alter the outcome to align with their expectations.
Need 2: Examining bias. Subjects of decision aim to ensure that the decisions obtained are fair [52] and ethical [97]. However, AI systems often exhibit biases, discrimination, and information asymmetry [118], which are difficult to detect due to the opaqueness of the system. XAI methods can assist in tracking and exposing biases and discrimination that occur in the decision-making process of the AI system [30, 73], and can even contribute to their mitigation to some extent [115].

3.2.9 General End-Users.

General end-users are individuals who directly interact with AI systems, as distinguished from the subjects of decision. They include individuals who use smart end-products (e.g., smart homes, social media, and e-commerce) in their daily lives.
When to explain: Deployment, Monitoring and Interaction.
Need 1: Establishing appropriate trust and building mental models. Establishing appropriate levels of trust is crucial for general end-users to engage in positive and satisfactory interactions with the AI system [235]. However, the complexity of the ML model leads to frequent misunderstandings and gaps between it and the user's mental model [185], making it difficult for the user to understand the ML model's behavior and decisions [164, 219]. As a result, general end-users are increasingly demanding transparency in the AI system they rely on daily [33]. XAI methods provide explanations that enable general end-users to develop appropriate trust in the system [89], thus helping them to build accurate “mental models” [233].
Need 2: Protecting personal data information. Similar to data subjects, general end-users highly value the protection of personal data information. They seek to understand the types of personal data utilized by AI systems in decision-making and the implications of those decisions [205, 210].

4 What to Explain

Some researchers have pointed out that explanation is actually an answer to a question [120, 142, 161]. Therefore, this section aims to elucidate the types of questions that stakeholders may pose to meet their needs. These questions, referred to as “what to explain” questions, serve as a crucial bridge within our roadmap, connecting stakeholders’ needs with appropriate XAI methods.
As we discussed in Section 2.3, previous research has considered various “what to explain” questions [120, 122, 123, 145]. However, the stakeholder questions considered in existing research focus on seeking explanations for the decision-making outcomes of ML models. None of them consider questions about the inner working mechanisms of an ML model, which are of interest to AI theory experts, AI development experts, and other curious or inquisitive stakeholders. To address this gap, this article introduces a new “what to explain” question, called “What Explanations.” Together with the questions already considered in the literature [120, 122, 123, 145], we arrive at a more comprehensive set of nine questions, as follows:
(1)
How Explanations: How does the system work as a whole? How Explanations seek to gain a comprehensive understanding of the system, encompassing its overall logical structure and decision boundaries. For example, how does the autonomous driving system work?
(2)
Why Explanations: Why does the system make a particular decision? Why Explanations aim to comprehend the factors influencing the system's decisions for a given input. They involve understanding the features of the input data, the underlying logic of the ML model, specific examples, and related aspects. For example, why did the car decide to slow down at this specific moment?
(3)
Why-not Explanations: Why does the system not make a particular decision? Why-not Explanations typically ask why the system does not produce a desired output for a specific input, particularly when it deviates from stakeholders’ expectations. The goal is to uncover the reasons for the discrepancies between the system's decisions and stakeholders’ expectations. For example, why did the vehicle not take an alternate route when encountering heavy traffic?
(4)
What Explanations: What happens inside the system? What Explanations ask how the system works internally, what potential information the ML model has learned, and what its internal representations are. Compared to other questions, such as How Explanations or Why Explanations, which focus on how the ML model makes decisions, What Explanations focus on understanding what knowledge the ML model has learned, such as what a particular structure or neuron has learned or represents. For example, what has the AI system learned internally about recognizing obstacles?
(5)
What-if Explanations: What would the system do if the input changes? What-if Explanations aim to understand how different input data impact the system's decisions. For example, if a pedestrian suddenly stepped in front of the autonomous car, how would it react?
(6)
What-else Explanations: What else are the similar instances? What-else Explanations provide stakeholders with instances that have similar inputs. For example, in what other similar scenarios does the autonomous vehicle also decide to turn?
(7)
How-to Explanations: How to let the system make another particular decision? How-to Explanations explore how to modify the inputs to obtain different outputs that are consistent with the user's expectations. For example, how to change the environment to allow autonomous vehicles to choose a faster route?
(8)
How-still Explanations: How much of a perturbation can there be while maintaining the same decision? How-still Explanations help stakeholders grasp the extent to which inputs or models can be altered while still obtaining consistent predictive results. For example, how much variation in road conditions can the vehicle withstand while still making the same decisions?
(9)
Data Explanations: Ask for information about the data. Data Explanations involve seeking potential information about both the training and input data, including features, types, ranges, sizes, distributions, potential biases, and disentangled representations. For example, what biases exist in the training data of the autonomous vehicle system?
The above nine types of questions summarize the potential needs of different stakeholders for XAI, particularly in AASs. As can be seen, the purposes of explanations differ considerably; as a result, the representations and contents of the explanations differ, as do the methods for generating them. Although there have been several excellent review papers on XAI methods [10, 74, 80, 125, 175, 257], none has clearly stated which XAI methods are used for answering which types of questions at which stages of the AI system lifecycle. Section 6 of this article fills this gap within our roadmap.

5 How to Explain

Having thoroughly examined the needs of various stakeholders for explanations and the “what to explain” questions, this section introduces 10 categories of widely used XAI methods. This is a crucial step in our roadmap, as it sets the stage for linking stakeholders’ needs with XAI methods through “what to explain” questions in Section 6. The XAI methods considered in this article were identified through a thorough review of existing popular XAI survey papers [10, 74, 80, 125, 175, 257]. These methods broadly encompass the techniques discussed in these surveys, ensuring a comprehensive and complete analysis. It is important to note that our objective here is not to provide an exhaustive review of existing XAI methods. For a more detailed description of these methods and their further categorization, we refer readers to the literature [10, 74, 80, 125, 175, 257]. Our primary objective here is to introduce the 10 major categories of XAI methods so that we can discuss, as part of our roadmap in the next section, which classes of XAI methods are best at addressing which types of questions to fulfill which needs.
(1)
Decision Trees (DTs) and Decision Rules (DRs). DTs and DRs are considered together in this article due to their similar interpretability properties [257]. They are categorized into global and local explanations based on the scope of the explanation. Global DTs [28, 42, 99, 112] and DRs [43, 98, 190, 211] approximate the explained ML model through techniques such as distillation and rule extraction. Local DTs [256] and DRs [177, 227], on the other hand, serve as surrogate models to explain the model's decisions on a small subset of instances.
(2)
FAE. FAE provides the importance of each feature [228] and can be divided into global and local explanations. Global FAE describes the overall significance of each feature used by the ML model [120], with existing methods including [84, 88, 128, 129, 209, 220]. Local FAE focuses on how much each input feature contributes to the ML model's output for a specific data point [21]. Existing methods include [13, 14, 45, 130, 132, 158, 176, 189, 193, 195], and so on.
(3)
PDP. PDP provides a visual representation of how prediction results vary with different feature values [80] and can be categorized into global and local explanations. Global PDP describes the impact of a set of input features of interest on the prediction results of the ML model as a whole. Classic methods in this category include [9, 68]. Local PDP emphasizes more on the dependency between features and model predictions on a specific sample, with classic methods including [23, 75].
(4)
CF. The CF methods considered in this article are divided into CF explanations (CFE) and CF instances (CFI). CFE yields different predicted outcomes (usually the desired outcome) by perturbing the inputs to the ML model [48, 146, 217, 223]. In contrast, CFI provides instances with similar inputs but different prediction results than the instance being explained [116, 120, 151].
(5)
Prototype Explanation (PE). PE typically provides instances that are similar to the data (set) being explained, thus providing insights into the dataset [82, 119].
(6)
Text Explanation (TE). TE refers to providing natural language explanations for the ML model's decisions [175]. Existing methods include [87, 107, 245], and so on.
(7)
Model Visualization (MV). MV refers to visualization techniques that provide insight into the internal workings of a black-box model, in particular visualizing neurons. Existing methods include [61, 62, 199, 242, 244, 248], and so on.
(8)
Graph Explanation (GE). GE refers to linking the input features, the concepts learned by the ML model, and the predictions through a graph structure [175], and methods include [85, 250–254].
(9)
Association Explanation (AE). AE aims to establish associations between inputs [5], internal model structures (e.g., neurons) [51], or outputs [117] with human-understandable conceptual objects (e.g., natural language, images) by modifying the ML model's architecture or incorporating regularization terms [175]. They are divided into input AE (IpAE), internal AE (ItAE), and output AE (OAE).
(10)
Exploratory Data Analysis (EDA). EDA provides insight into data through statistical techniques and visualization [213], aiding in understanding data distribution, quality, feature relationships, and their impact on data objectives [102].
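To make the EDA category concrete, the following minimal Python sketch (pandas assumed; the file name and the target column "label" are hypothetical, and the last line assumes a numerically encoded target) illustrates the kind of analysis described above: distributional summaries, missing-value rates, class balance, and feature-target correlations.

import pandas as pd

df = pd.read_csv("training_data.csv")                        # hypothetical dataset
print(df.describe(include="all"))                            # per-column distributional summary
print(df.isna().mean().sort_values(ascending=False))         # fraction of missing values per feature
print(df["label"].value_counts(normalize=True))              # class balance of the (assumed) target column
numeric = df.select_dtypes("number")
print(numeric.corr()["label"].sort_values(ascending=False))  # correlation of numeric features with the target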
In order to facilitate the mapping between stakeholders’ needs and XAI methods using the “what to explain” questions as a bridge, we discuss the relationship between the nine “what to explain” questions described in Section 4 and the 10 XAI methods. Specifically, we identify which XAI methods can address the respective questions. This mapping forms a critical component of our roadmap. It is worth mentioning that a similar mapping was attempted before [120], although only six “what to explain” questions and five XAI methods were considered. We have extended the previous work [120] and produced a more comprehensive result as given in Table 1, which lays the groundwork for employing “what to explain” questions as a bridge between stakeholders’ needs and XAI methods in the next section.
“What to Explain” Question: XAI Methods
How Explanations: Global DT, Global DR, Global FAE, GE
Why Explanations: Local DT, Local DR, Local FAE, PE, TE, IpAE, OAE, Global DT, Global DR, CF, GE, ItAE
Why-not Explanations: Local FAE, CF, Global DT, Global DR
What Explanations: MV, GE, ItAE
What-if Explanations: PDP, Global DT, Global DR, GE
What-else Explanations: PE, CFI
How-to Explanations: CF, Local PDP, Global DT, Global DR
How-still Explanations: Local DT, Local DR, PDP, Global DT, Global DR, PE
Data Explanations: PE, EDA
Table 1. The Mapping between the “What to Explain” Questions and XAI Methods, Which Serves as an Extension and Refinement of the Related Work in the Literature [120]
A bolded font XAI method means that the method can answer the corresponding “what to explain” question well, while a regular font XAI method may be able to answer the corresponding “what to explain” question, but the answer may not be intuitive or comprehensive.
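For practitioners who want to apply this guideline programmatically, the mapping in Table 1 can also be encoded as a simple lookup structure. The sketch below is a plain Python dictionary mirroring the table rows (the bold/regular distinction described in the note is not reproduced); the helper function is hypothetical.

# A minimal encoding of the Table 1 mapping: "what to explain" question -> XAI method families
QUESTION_TO_METHODS = {
    "how":       ["Global DT", "Global DR", "Global FAE", "GE"],
    "why":       ["Local DT", "Local DR", "Local FAE", "PE", "TE", "IpAE", "OAE",
                  "Global DT", "Global DR", "CF", "GE", "ItAE"],
    "why-not":   ["Local FAE", "CF", "Global DT", "Global DR"],
    "what":      ["MV", "GE", "ItAE"],
    "what-if":   ["PDP", "Global DT", "Global DR", "GE"],
    "what-else": ["PE", "CFI"],
    "how-to":    ["CF", "Local PDP", "Global DT", "Global DR"],
    "how-still": ["Local DT", "Local DR", "PDP", "Global DT", "Global DR", "PE"],
    "data":      ["PE", "EDA"],
}

def candidate_methods(question):
    """Return the XAI method families that can address a given 'what to explain' question."""
    return QUESTION_TO_METHODS[question]

print(candidate_methods("why-not"))  # ['Local FAE', 'CF', 'Global DT', 'Global DR']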

6 Bridging Stakeholders’ Needs and XAI Methods: A Roadmap and Guideline

In this section, we establish a connection between the stakeholders’ needs outlined in Section 3 and the XAI methods discussed in Section 5 using the “what to explain” questions from Section 4 as a bridge. This connection forms a central component of our roadmap. Specifically, we first relate the stakeholders’ needs to the “what to explain” questions by considering what questions stakeholders are likely to ask in order to satisfy a specific need. Table 1 is then used to determine which XAI methods can answer specific questions and thus satisfy the corresponding needs of the stakeholders. This roadmap provides a guideline to assist stakeholders in selecting the appropriate XAI method to satisfy their need for explanation. Finally, we provide some examples of existing research to illustrate how the corresponding XAI methods can be used to meet stakeholders’ needs. For clarity, we reiterate the full name of each XAI method and its corresponding abbreviation, as shown in Table 2.
Full Name: Abbreviation
Decision tree: DT
Decision rule: DR
Feature attribution explanation: FAE
Partial dependence plot: PDP
Counterfactual explanations/instance: CFE/CFI
Prototype explanation: PE
Text explanation: TE
Model visualization: MV
Graph explanation: GE
Input/internal/output association explanation: IpAE/ItAE/OAE
Exploratory data analysis: EDA
Table 2. Full Names and Abbreviations of the XAI Methods Considered
In the following, if an XAI method is underlined, it indicates that there has been research using that particular XAI method to meet the corresponding needs. The rest of this section will be organized around different stakeholders, their needs for explanations at different stages of the AI system lifecycle, and how their needs were met by existing XAI methods.

6.1 AI Theory Experts

When to explain: Modeling AI; Deployment, Monitoring and Interaction.
Need: Insight and understanding of the internal logic of complex ML models.
What to explain: What Explanations.
Corresponding XAI methods: MV, GE, ItAE.
Examples: Several existing works have focused on providing insights into black-box ML models by visualizing neurons using various techniques such as activation maximization [61, 62, 242], deconvolution [244], guided backpropagation [199], as well as and-or graph construction [250, 251, 252, 253, 254]. For example, Yosinski et al. [242] showed the internal features of each layer in a convolutional neural network (CNN) through an activation-maximization-based MV method, thus helping researchers build valuable intuitions about how CNNs work. There is also work that facilitates the understanding of the information learned by each neuron in a neural network by associating neurons with human-understandable semantic concepts [17, 255].
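To give a flavor of the activation-maximization idea underlying such MV methods, the following PyTorch sketch (torchvision assumed; the choice of model, layer index, and channel is purely illustrative) performs gradient ascent on a random input image so that it increasingly excites one convolutional channel.

import torch
from torchvision import models

# Hypothetical setup: a pretrained CNN and a target channel whose preferred input we want to visualize
model = models.vgg16(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)
target_layer = model.features[28]   # a late convolutional layer, chosen for illustration
unit = 42                           # hypothetical channel index

activation = {}
def hook(_, __, output):
    activation["value"] = output
target_layer.register_forward_hook(hook)

# Start from random noise and ascend the gradient of the chosen channel's mean activation
img = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)
for _ in range(200):
    optimizer.zero_grad()
    model(img)
    loss = -activation["value"][0, unit].mean()  # negate so that minimizing maximizes the activation
    loss.backward()
    optimizer.step()
# `img` now approximates the input pattern that most strongly excites the chosen channel.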

6.2 AI Development Experts

When to explain: Model Requirements Analysis.
Need: Comparative analysis of multiple ML models.
What to explain: How, Why, What Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, MV.
Examples: Most of the existing studies compared multiple models by FAE methods [65, 103, 139, 155, 160]. For example, Narkar et al. [155] developed Model LineUpper, a tool that integrates several XAI and visualization techniques, including global and local FAE, to support data scientists in manually selecting the final model from dozens of candidates. Mostafa et al. [150] assisted in selecting a better CNN model through an MV explanation method called guided backpropagation by visualizing the neurons in the CNN.
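In a similar spirit, a lightweight global FAE comparison of candidate models can be obtained with permutation importance; the sketch below (scikit-learn assumed, synthetic data, hypothetical candidates) contrasts the feature rankings and test accuracies of two models.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

candidates = {
    "random_forest": RandomForestClassifier(random_state=0).fit(X_tr, y_tr),
    "logistic_regression": LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
}

# Compare how each candidate ranks the same features (a simple global FAE view) and its accuracy
for name, model in candidates.items():
    result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
    ranking = result.importances_mean.argsort()[::-1]
    print(name, "top features:", ranking[:3], "accuracy:", model.score(X_te, y_te))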
When to explain: Data Acquisition and Understanding.
Need 1: Insight and Understanding of Datasets.
What to explain: Data Explanations.
Corresponding XAI methods: PE, EDA.
Examples: Limited research has focused on utilizing XAI methods to comprehend and gain valuable insights into datasets. DSouza et al. [55] employed EDA to analyze and derive insights from the COVID-19 dataset, providing a better understanding of the impacts of the pandemic in relation to the variables and labels within the dataset. Gurumoorthy et al. [82] employed a PE approach to analyze the entire dataset by selecting a representative set of examples and associated weights from the dataset in order to capture the underlying distribution of the entire dataset.
Need 2: Analyzing potential errors, noise, and bias in the dataset.
What to explain: Data Explanations.
Corresponding XAI methods: PE, EDA.
Examples: Few studies have applied XAI methods to evaluate the quality of datasets. One notable example is Schmidt [186], who analyzed gender bias in 14 million teaching assessments on the RateMyProfessors.com Web site through EDA.
When to explain: Modeling AI.
Need 1: Assisting with feature selection.
What to explain: How, Why Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF.
Examples: The FAE method quantifies the contribution of each input feature to the decision-making process of an ML model [21], making it a popular tool among researchers for feature selection. For example, Ribeiro et al. [176] improved the performance of the ML model by removing unimportant features, based on the interpretation provided by LIME. In [20], DNNs were evaluated using LRP to identify information in the dataset that hindered the training of deep learning models, and removing this information improved the F1 score by 20%. Yang et al. [240] used FAE for feature selection and improved fairness by eliminating features that lead to unfair decision-making. In addition, many related works [100, 135, 212, 243] used the FAE method based on the Shapley value for feature selection.
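As an illustrative sketch of Shapley-value-based feature selection (the shap package is assumed to be installed; the data are synthetic and the importance threshold is arbitrary), global importances can be computed as mean absolute SHAP values and low-importance features dropped before retraining.

import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=10, n_informative=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

# Global importance: mean absolute SHAP value of each feature over the dataset
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)      # shape: (n_samples, n_features)
importance = np.abs(shap_values).mean(axis=0)

# Keep only features whose attribution exceeds an (arbitrary) fraction of the strongest one
keep = importance > 0.05 * importance.max()
print("selected features:", np.where(keep)[0])
model_reduced = RandomForestRegressor(random_state=0).fit(X[:, keep], y)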
Need 2: Optimizing model architecture and hyper-parameters.
What to explain: How, Why, What Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, MV.
Examples: Researchers have leveraged XAI methods, including MV [150, 238, 244] and FAE [81], to assist in fine-tuning the ML model's architecture and hyper-parameters. As an example, Zeiler and Fergus [244] gained insight into the functionality of the ML model's intermediate feature layers and the operation of the classifier by visualizing the ML model, and adjusted its architecture accordingly, ultimately improving the ML model's performance.
Need 3: Checking the model's decisions.
What to explain: How, Why, Why-not, What, What-if, What-else, How-to, How-still Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, MV, PDP.
Examples: Researchers have utilized various XAI methods to understand and review black box models, including FAE [26], AE [51], DR [143], DT [173], MV [126], TE [50], and PDP [111], among others. For example, Bojarski et al. [26] used a local FAE method to explain what a DNN learned with end-to-end training and how it makes decisions to drive a car automatically. Liu et al. [126] introduced DGMTracker, a visualization tool that analyzes the training process of deep generative models by visualizing dataflow and examining neuron contributions, providing valuable insights for AI development experts to comprehend and diagnose the training process. Krause et al. [111] developed Prospector, an interactive visual analytics system that provides interactive partial dependency diagnostics through a PDP approach, thereby assisting data scientists in analyzing how features affect the overall prediction.
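A minimal PDP-style check of a model's decisions can be produced with scikit-learn's partial dependence utilities; the sketch below (synthetic regression data; the feature indices are chosen arbitrarily) plots how two features, and their interaction, drive the predictions.

import matplotlib.pyplot as plt
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_friedman1(n_samples=500, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Global partial dependence for features 0 and 1, plus their two-way interaction
PartialDependenceDisplay.from_estimator(model, X, features=[0, 1, (0, 1)])
plt.show()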
Need 4: Guiding model debugging and error refinement.
What to explain: How, Why, Why-not, What Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, MV.
Examples: Many existing studies have employed XAI methods, including FAE [15, 20, 34, 69, 76, 163, 222] and AE [51], for debugging models and mitigating biases. For example, Palatnik et al. [163] used various FAE methods, including LIME [176], GradCAM [189], and SmoothGrad [196], to analyze the COVID CT-Scan classifier and successfully identify biases within the ML model. Dong et al. [51] proposed an AE method that associates each neuron with a topic, enhancing the interpretability of DNNs. Several other works [27, 114, 188, 207] have incorporated AI experts into the training loop through human-in-the-loop-based explanatory debugging and used their feedback on the provided explanations to modify and refine the ML model.
When to explain: Deployment, Monitoring and Interaction.
Need 1: Adjusting the ML model to meet the user's expectations and needs.
What to explain: Why, Why-not, What-if, How-to Explanations.
Corresponding XAI methods: DT, DR, Local FAE, PE, TE, AE, CF, GE, PDP.
Examples: Researchers have used various explanation methods, such as FAE [69, 178], CF [29], and AE [51], to adapt the ML model to better align with stakeholders’ expectations and requirements. For example, Rieger et al. [178] proposed contextual decomposition explanation penalization (CDEP). When the ML model incorrectly assigns importance to certain features (contrary to stakeholders’ expectations), CDEP can correct these errors by using explanations to insert domain knowledge into the ML model. Brandao and Magazzeni [29] minimally modified the map via CFE in path planning, providing stakeholders with suggestions for map changes that can make the desired path optimal. Some other studies adopted a human-in-the-loop interpretive debugging approach [114, 188, 207]. In this iterative process, the system initially explains its predictions to the user, who then provides necessary corrections. This iterative collaboration enables the ML model to be refined and aligned with the user's expectations.
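As a purely illustrative sketch of the CF idea (not the method of [29]), a brute-force search over single-feature perturbations can find the smallest change that flips a tabular model's prediction toward the outcome a user expects; scikit-learn and synthetic data are assumed, and the search grid is arbitrary.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def single_feature_counterfactual(x, desired_class, steps=50):
    """Search for the smallest single-feature change that yields the desired prediction."""
    best = None
    for j in range(x.shape[0]):
        lo, hi = X[:, j].min(), X[:, j].max()
        for value in np.linspace(lo, hi, steps):
            x_cf = x.copy()
            x_cf[j] = value
            if model.predict(x_cf.reshape(1, -1))[0] == desired_class:
                cost = abs(value - x[j])
                if best is None or cost < best[2]:
                    best = (j, value, cost)
    return best  # (feature index, new value, change magnitude) or None

x = X[0]
print("original prediction:", model.predict(x.reshape(1, -1))[0])
print("counterfactual:", single_feature_counterfactual(x, desired_class=1 - y[0]))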
Need 2: Assessing the impact of dataset shift.
What to explain: How, Why, Data Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, EDA.
Examples: Currently, there is little work on using XAI methods to detect dataset shifts. Among them, Wang et al. [229] used the FAE method combined with two-sample detection to assist in detecting harmful dataset shifts. Masegosa et al. [136] proposed a method for EDA of stream data based on probabilistic graphical models to explore and analyze financial customer data, capturing the sources of different conceptual drifts.
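One common recipe in this spirit, sketched below with scikit-learn only (synthetic data; the injected shift is artificial), is to train a "domain classifier" that distinguishes training data from production data and then inspect its feature attributions to locate the drift.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 5))
X_prod = rng.normal(size=(1000, 5))
X_prod[:, 2] += 1.5                  # simulate a shift in feature 2

# Label samples by origin and train a classifier to tell the two datasets apart
X = np.vstack([X_train, X_prod])
origin = np.array([0] * len(X_train) + [1] * len(X_prod))
X_tr, X_te, o_tr, o_te = train_test_split(X, origin, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, o_tr)

# Accuracy well above 0.5 signals a shift; feature attributions reveal where it comes from
print("domain classifier accuracy:", clf.score(X_te, o_te))
imp = permutation_importance(clf, X_te, o_te, n_repeats=10, random_state=0)
print("most shifted feature:", imp.importances_mean.argmax())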

6.3 Model Validation Team

When to explain: Data Acquisition and Understanding.
Need: Evaluating data suitability.
What to explain: Data Explanations.
Corresponding XAI methods: PE, EDA.
Examples: EDA enables the examination of data distribution, and evaluation of data quality and structure using analytical techniques such as visualization [110], thus helping the model validation teams gain insight into the training data to determine its suitability in relation to their proprietary data.
When to explain: Modeling AI.
Need 1: Reviewing the ML model's decision logic.
What to explain: How, Why, Why-not, What-if, What-else, How-to, How-still Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Gerlings et al. [72] noted that a group of experienced hospital professionals and medical annotators reviewed the output of the ML model through the XAI methods to ensure consistency with their expertise. This thorough evaluation resulted in the removal of two out of the original six pathologies, as their performance did not meet the required standards. Christianto et al. [39] proposed a smart interpretable model framework that requires little AI knowledge. This framework generates a set of fuzzy IF-THEN rules along with their corresponding membership functions, enabling subject matter experts to gain comprehensive insights into the ML model through these rules and sample interpretations.
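A simple way to give a validation team a reviewable, rule-like picture of a black-box model's decision logic is a global surrogate decision tree trained on the model's own predictions; the sketch below (scikit-learn only; the depth limit and data are illustrative) prints the extracted rules and reports the surrogate's fidelity to the black box.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Train a shallow tree to mimic the black-box model's predictions (not the true labels)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print("fidelity to black box:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate, feature_names=[f"feature_{i}" for i in range(6)]))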
Need 2: Determining compliance with regulations.
What to explain: How, Why, Why-not, What-if, What-else Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Existing research has applied XAI methods to meet legal requirements for model interpretability [109, 169], to assess the fairness of models and detect bias [149, 165, 234, 247], to evaluate the security of models [156], and so on. For example, to ensure compliance with relevant laws such as the GDPR, Kolyshkina and Simoff [109] defined the interpretability of ML solutions and proposed CRISP-ML, a detailed step-by-step methodology that creates the necessary level of interpretability guidance at each stage of the solution-building process, guiding companies in building the ability to interpret their algorithms’ decisions so as to meet relevant regulations. Wich et al. [234] employed XAI methods to visualize biases in politically biased datasets, thus helping to build unbiased datasets or to de-bias existing ones based on the interpreted results. Nguyen and Choo [156] proposed a human-in-the-loop framework that integrates security analysts and forensic investigators into the loop, leveraging XAI methods to assist in vulnerability detection, investigation, and mitigation, ultimately enhancing the security of the ML model.

6.4 Model Operators

When to explain: Deployment, Monitoring and Interaction.
Need: Ensuring correct and efficient interaction.
What to explain: How, Why, What-if, What-else Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Zhang et al. [246] introduced and discussed how XAI methods (including global DTs, local FAE, PDP, local DRs, and CF) can support audit practitioners, guiding them to interact properly with AI tools and to meet audit documentation and audit evidence standards. Furthermore, Zhang et al. [249] applied their Deep-SHAP method in a power system emergency control scenario, enabling operators to comprehend the underlying mechanisms of deep reinforcement learning models and to interact with the ML models more efficiently.

6.5 Decision Makers

When to explain: Deployment, Monitoring and Interaction.
Need 1: Comprehending specific model decisions.
What to explain: Why, Why-not, What-else Explanations.
Corresponding XAI methods: DT, DR, Local FAE, PE, TE, AE, CF, GE.
Examples: Existing studies have primarily focused on using local FAE methods to help decision makers understand specific model decisions [1, 44, 59, 131, 166, 180, 184, 202, 226, 239, 241]. For example, Lundberg et al. [131] developed an ML system capable of predicting the risk of hypoxemia in real time while explaining the risk factors during anesthesia. Experiments demonstrated that the system's explanations were broadly consistent with the prior knowledge of anesthesiologists and that the explanations improved anesthesiologist performance. Monteath and Sheh [147] proposed a new XAI approach that used global DTs to provide incremental decision support for medical diagnosis. Their approach promotes collaboration between AI systems and subject matter experts, facilitating mutual information exchange and joint decision-making, and can guide physicians in determining which test results are most useful given the available data. Kavya et al. [104] extracted rules from random forests and presented them in IF-THEN format to aid clinical diagnosticians in comprehending model decisions.
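For instance, a local FAE for a single decision can be produced with SHAP's tree explainer, as in the hedged sketch below; the toy features merely stand in for clinical variables and do not reproduce the system of [131]. It assumes the shap package is installed.

```python
# Hedged sketch of a local FAE with SHAP's tree explainer; data are synthetic.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + X[:, 2] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])   # per-feature contributions for one decision
print(shap_values)
```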
Need 2: Deepening overall understanding of the ML model and improving decisions.
What to explain: How Explanations.
Corresponding XAI methods: Global DT, Global DR, Global FAE, GE.
Examples: Existing work helping decision makers understand the overall logic of the ML model has focused on global FAE methods [1, 35, 166, 224, 226]. Among them, Wang et al. [226] used a global FAE to extract significant features from intrusion detection systems. By illustrating the association between feature values and various types of attacks, their method enabled security experts to gain a deeper understanding of intrusion detection system design. Rad et al. [173] proposed a new XAI method that extracts surrogate DTs from black-box models to explain, at a global level, the temporal importance of clinical features in predicting kidney graft survival, thus helping physicians understand the decision logic behind the black-box models.
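A minimal sketch of the surrogate-tree idea is shown below: an interpretable decision tree is fitted to mimic a black-box model's predictions, and its rules are printed for inspection. The dataset and model choices are illustrative only, not those of [173].

```python
# Hedged sketch of a global surrogate decision tree: fit an interpretable tree
# to mimic a black-box model's predictions and print its rules.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))        # mimic the black box, not the labels

print("fidelity:", (surrogate.predict(X) == black_box.predict(X)).mean())
print(export_text(surrogate, feature_names=list(X.columns)))
```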

6.6 Regulators

When to explain: Deployment, Monitoring and Interaction.
Need 1: Ancillary model review.
What to explain: How, Why, Why-not, What-if, What-else Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Adler et al. [3] proposed an XAI method to evaluate the fairness of black box models from the perspective of an external reviewer. They examined the extent to which the black box model utilizes specific sensitive attributes (e.g., skin color, gender) in the dataset. Similar work [149, 165, 234, 247] has also used XAI methods to reveal potential biases in the ML models as well as to assess the fairness of the ML models. However, no existing research has been found that specifically applies XAI methods for evaluating the safety and robustness of the ML models.
Need 2: Assisting in apportioning responsibility.
What to explain: Why Explanations.
Corresponding XAI methods: DT, DR, Local FAE, PE, TE, AE, CF, GE.
Examples: Jafta et al. [95] emphasized the potential of XAI methods in understanding the decision-making process of models, identifying relevant variables, and distinguishing between morally right and wrong decisions. These capabilities not only contribute to uncovering the truth but also facilitate accountability and the prevention of future malpractices. Lima et al. [124] explored whether and to what extent explainability can assist in addressing the accountability problem posed by autonomous AI systems. However, despite these discussions, we have not encountered any specific examples or instances where XAI methods have been applied directly to promote accountability.

6.7 Data Subjects

When to explain: Modeling AI; Deployment, Monitoring and Interaction.
Need: Protecting personal data information.
What to explain: How, Why Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF.
Examples: EU regulations grant data subjects the right to obtain an explanation of how the ML model uses their data to make decisions [8]. The GDPR [214] safeguards the personal data of all EU residents, regardless of the place of processing, and it gives EU residents the right to access, rectify, delete, and restrict the processing of their personal data. However, our investigation has not found any relevant research that uses XAI methods to disclose what information about data subjects is used by the ML models in their decisions and how the data of data subjects affect the ML models’ decisions.

6.8 Subjects of Decision

When to explain: Deployment, Monitoring and Interaction.
Need 1: Understanding the decision and how to maintain or change it.
What to explain: Why, Why-not, What-if, What-else, How-to, How-still Explanations.
Corresponding XAI methods: DT, DR, Local FAE, PE, TE, AE, CF, GE, PDP.
Examples: An example provided in the AIX360 toolkit [12] uses the CF method to explain why a bank customer's loan was denied and what modifications should be made to get the loan approved. The XAI approach proposed in [147] used DTs to provide incremental decision support for medical diagnosis. The system was able to explain how a particular decision was made and trace it directly back to the underlying training data, thus helping patients understand the system's decisions as well as increasing their confidence in the system's decisions.
Need 2: Examining bias.
What to explain: How, Why, Why-not, What-if Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Wang et al. [230] used FAE to portray the decision-making process of the ML model in order to assess its procedural fairness. Malhi et al. [133] conducted an HCI study to investigate the effect of explanations on the biases introduced by human participants in human-computer decision-making. Their experiments found that users benefited from the explanations, showed increased trust, and were able to reduce bias in their decision-making. Similar research on assessing model fairness [149, 165, 234, 247] can also be leveraged to assist subjects of the decision in detecting, or even mitigating, bias and discrimination in the ML models.

6.9 General End-Users

When to explain: Deployment, Monitoring and Interaction.
Need 1: Establishing appropriate trust and building mental models.
What to explain: How, Why, Why-not, What-if, What-else Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF, PDP.
Examples: Researchers have employed various XAI methods to help general end-users develop trust and construct accurate mental models. For example, Kulesza et al. [114] proposed an explanatory debugging approach in which the system explains to the user how it made each prediction, and the user then provides the learning system with any necessary corrections, which helps the end-user build an appropriate mental model. In [236], interviews and user research conducted in the context of autonomous driving revealed that XAI visualizations depicting detected objects and their predicted actions significantly enhanced users’ comprehension of and trust in the ML model, thereby facilitating users’ mental model development. Van der Waa et al. [215] proposed a framework for an interpretable confidence measure (ICM). Experiments with end-users of self-driving cars demonstrated that the ICM can provide users with an interpretable, case-based confidence value for decision outcomes, effectively guiding users on whether to trust the ML model's decision recommendations.
Need 2: Protecting personal data information.
What to explain: How, Why Explanations.
Corresponding XAI methods: DT, DR, FAE, GE, PE, TE, AE, CF.
Examples: Some researchers have developed various types of visualization tools, such as Floodwatch [159] and Behind the Banner [16], to inform users about the data advertisers may collect. However, no research has yet explored how XAI methods can be used to protect users’ private data.
The above detailed discussions are summarized in Table 3, which gives a global picture of the relationships among different stakeholders, their needs for explanation at different stages of the AI system lifecycle, different types of explanations, and existing methods that could be used to generate explanations. Table 3 is a vital part of the roadmap and serves as a guideline to help stakeholders select appropriate XAI methods to fulfill their needs for explanation.
Table 3.
Explain to Whom | When to Explain | Need | What to Explain | How to Explain | Examples
AI theory experts | | Insight and understanding of the internal logic of complex ML models | What | MV, GE, ItAE | [17, 61, 62, 90, 101, 157, 168, 199, 242, 244, 248, 250–255]
AI development experts | | Comparing analysis of multiple ML models | How, why, what | DT, DR, FAE, GE, PE, TE, AE, CF, MV | [65, 103, 139, 150, 155, 160]
 | | Insight and understanding of datasets | Data | PE, EDA | [55, 82]
 | | Analyzing potential errors, noise, and bias in the dataset | Data | PE, EDA | [186]
 | | Assisting with feature selection | How, why | DT, DR, FAE, GE, PE, TE, AE, CF | [20, 100, 105, 135, 148, 152, 176, 206, 212, 216, 240, 243]
 | | Optimizing model architecture and hyperparameters | How, why, what | DT, DR, FAE, GE, PE, TE, AE, CF, MV | [77, 81, 150, 197, 203, 238, 244]
 | | Checking the model's decisions | How, why, why-not, what, what-if, what-else, how-to, how-still | DT, DR, FAE, GE, PE, TE, AE, CF, MV, PDP | [19, 26, 50, 51, 106, 111, 126, 143, 173]
 | | Guiding model debugging and error refinement | How, why, why-not, what | DT, DR, FAE, GE, PE, TE, AE, CF, MV | [4, 15, 19, 20, 34, 51, 69, 76, 114, 162, 163, 170, 188, 198, 200, 207, 222]
 | Deployment, Monitoring and Interaction | Adjusting the ML model to meet the user's expectations and needs | Why, why-not, what-if, how-to | DT, DR, Local FAE, PE, TE, AE, CF, GE, PDP | [22, 29, 51, 69, 114, 178, 188, 207]
 | Deployment, Monitoring and Interaction | Assessing the impact of dataset shift | How, why, data | DT, DR, FAE, GE, PE, TE, AE, CF, EDA | [32, 38, 136, 229]
Model validation team | Data Acquisition and Understanding | Evaluating data suitability | Data | PE, EDA | [110]
 | Modeling AI | Reviewing the ML model's decision logic | How, why, why-not, what-if, what-else, how-to, how-still | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [39, 46, 113, 127]
 | Modeling AI | Determining compliance with regulations | How, why, why-not, what-if, what-else | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [109, 149, 156, 165, 169, 234, 247]
Model operators | Deployment, Monitoring and Interaction | Ensuring correct and efficient interaction | How, why, what-if, what-else | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [134, 174, 194, 246, 249]
Decision makers | Deployment, Monitoring and Interaction | Comprehending specific model decisions | Why, why-not, what-else | DT, DR, Local FAE, PE, TE, AE, CF, GE | [1, 44, 59, 66, 104, 131, 147, 166, 180, 184, 202, 226, 239, 241]
 | Deployment, Monitoring and Interaction | Deepening overall understanding of the ML model and improving decisions | How | Global DT, Global DR, Global FAE, GE | [1, 35, 44, 166, 173, 224, 226]
Regulators | Deployment, Monitoring and Interaction | Ancillary model review | How, why, why-not, what-if, what-else | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [3, 149, 165, 234, 247]
 | Deployment, Monitoring and Interaction | Assisting in apportioning responsibility | Why | DT, DR, Local FAE, PE, TE, AE, CF, GE | None
Data subjects | Modeling AI; Deployment, Monitoring and Interaction | Protecting personal data information | How, why | DT, DR, FAE, GE, PE, TE, AE, CF | None
Subjects of decision | Deployment, Monitoring and Interaction | Understanding the decision and how to maintain or change it | Why, why-not, what-if, what-else, how-to, how-still | DT, DR, Local FAE, PE, TE, AE, CF, GE, PDP | [12, 147]
 | Deployment, Monitoring and Interaction | Examining bias | How, why, why-not, what-if | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [40, 133, 149, 165, 230, 234, 247]
General end-users | Deployment, Monitoring and Interaction | Establishing appropriate trust and building mental models | How, why, why-not, what-if, what-else | DT, DR, FAE, GE, PE, TE, AE, CF, PDP | [25, 54, 86, 114, 172, 215, 232, 236]
 | Deployment, Monitoring and Interaction | Protecting personal data information | How, why | DT, DR, FAE, GE, PE, TE, AE, CF | None
Table 3. A Roadmap's Guideline to Assist Stakeholders in Selecting Appropriate XAI Methods to Meet Their Needs for Explanation
It uses the “what to explain” questions as a bridge to connect the stakeholders’ needs for explanation at different stages with XAI methods. An underline below the XAI method indicates that the XAI method has been used in the given examples to meet the corresponding need of the stakeholder.
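To illustrate how this guideline can be consulted in practice, the sketch below encodes two rows of Table 3 as a simple lookup structure; the dictionary and helper function are our own illustrative constructs, not an artifact that accompanies the article.

```python
# Illustrative encoding of two rows of Table 3 as a lookup structure.
ROADMAP = {
    ("Decision makers", "Comprehending specific model decisions"): {
        "questions": ["Why", "Why-not", "What-else"],
        "methods": ["DT", "DR", "Local FAE", "PE", "TE", "AE", "CF", "GE"],
    },
    ("Subjects of decision",
     "Understanding the decision and how to maintain or change it"): {
        "questions": ["Why", "Why-not", "What-if", "What-else", "How-to", "How-still"],
        "methods": ["DT", "DR", "Local FAE", "PE", "TE", "AE", "CF", "GE", "PDP"],
    },
}

def suggest_methods(stakeholder, need):
    """Return the candidate XAI methods for a stakeholder's explanation need."""
    return ROADMAP[(stakeholder, need)]["methods"]

print(suggest_methods("Decision makers", "Comprehending specific model decisions"))
```

Extending such a structure to all rows of Table 3 would allow tool builders to surface candidate XAI methods automatically once a stakeholder and need are identified.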
Furthermore, from Table 3 we can identify which XAI methods are applicable to the widest range of stakeholders. Specifically, PE is the most generally applicable XAI method, capable of addressing 21 out of 23 different stakeholder needs; the two exceptions are helping AI theory experts gain insight into the internal logic of complex ML models and helping decision makers deepen their overall understanding of the ML model. This finding aligns with Humer et al. [93], who compared several XAI methods and found that PE was the most helpful for human decision-making. Additionally, DT, DR, FAE, CF, TE, GE, and AE are highly versatile XAI methods, each capable of addressing over 75% of these stakeholder needs.
In addition, Table 3 makes clear what has been accomplished and what research gaps remain, which helps to gauge the state of the art in XAI in a more comprehensive context. As an example of such gaps, AI development experts in AASs could benefit from (1) gaining insight into the overall behavior of an ML model through global XAI methods such as Global DT, Global DR, or GE, aiding model comparison in adaptive environments, (2) assistance with feature selection by identifying the features important for decision-making in real-time autonomous systems through DT or DR, and (3) MV techniques for debugging models, particularly for autonomous systems that must react dynamically to unforeseen changes.

7 Case Studies

After providing the roadmap to bridge stakeholder needs and existing XAI methods in Section 6, this section demonstrates how the roadmap's guideline (Table 3) can be utilized to assist stakeholders in fulfilling their needs through two hypothetical case studies. Specifically, we focus on autonomous driving and financial loan scenarios within the context of AASs, which are representative of real-world applications [25, 31, 36].

7.1 Autonomous Driving Scenario

In an autonomous driving scenario, the adaptive driving system uses ML models to assist in making decisions such as route optimization, obstacle detection, and vehicle speed adjustments. This scenario involves two primary stakeholders: the vehicle driver (general end-users) and the traffic safety officer (regulators). The following section describes their needs for explanations at a given stage, the questions they may ask, and how the corresponding existing XAI technologies can answer these questions to fulfill their needs. An overview is shown in Figure 3.
Fig. 3. Overview of the autonomous driving case study.
Stakeholder 1:
Whom: General end-users (a vehicle driver).
When: Deployment, Monitoring and Interaction.
Need: Establishing appropriate trust and building mental models. This vehicle driver expects to understand why the autonomous driving system makes particular decisions, facilitating the development of trust in its operation.
What: According to Table 3, when general end-users have the need for “establishing appropriate trust and building mental models,” the questions they may ask include “How,” “Why,” “Why-not,” “What-if,” and “What-else.” In this scenario, we assume that the vehicle driver asks the following question:
(1) Why does the system make a particular decision?
How: Table 3 indicates that eight XAI methods (“DT,” “DR,” “Local FAE,” “PE,” “TE,” “AE,” “CF,” and “GE”) could potentially address the question posed to fulfill that need. In this scenario, we use the following XAI method to answer the “Why” question asked by the vehicle driver:
(1) Local FAE is used in this scenario to answer the “Why” question. This method helps the vehicle driver understand the autonomous driving system's decisions by highlighting key information, such as lane markings, surrounding vehicles, and obstacles, that influence the system's actions.
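One simple way to generate such a local FAE for camera input is occlusion-based sensitivity analysis, sketched below with a randomly initialized CNN and a random frame as stand-ins for a real perception model; it only illustrates the mechanics.

```python
# Toy occlusion-based saliency sketch: one way to obtain a local FAE for images.
# The randomly initialized CNN and random frame are stand-ins, not a real system.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
model.eval()

img = torch.rand(1, 3, 64, 64)                  # stand-in camera frame
with torch.no_grad():
    base = model(img)[0, 1].item()              # score of the action being explained

patch, saliency = 8, torch.zeros(64, 64)
with torch.no_grad():
    for i in range(0, 64, patch):
        for j in range(0, 64, patch):
            occluded = img.clone()
            occluded[:, :, i:i + patch, j:j + patch] = 0.0
            drop = base - model(occluded)[0, 1].item()
            saliency[i:i + patch, j:j + patch] = drop   # large drop = important region

print(saliency.abs().max())                     # strength of the most influential region
```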
Stakeholder 2:
Whom: Regulators (a traffic safety officer).
When: Deployment, Monitoring and Interaction.
Need: Ancillary model review. This traffic safety officer needs the help of XAI technologies to assist him in reviewing the compliance of the autonomous driving system with traffic safety regulations.
What: According to Table 3, when regulators have the need for “ancillary model review,” the questions they may ask include “How,” “Why,” “Why-not,” “What-if,” and “What-else.” In this scenario, we assume that the traffic safety officer asks the following question:
(1) How does the system work as a whole?
How: Table 3 indicates that four XAI methods (“Global DT,” “Global DR,” “Global FAE,” and “GE”) could potentially address the question posed to fulfill that need. In this scenario, we use the following XAI method to answer the “How” question asked by the traffic safety officer:
(1) Global DT is used in this scenario to answer the “How” question. This approach reveals the overall decision-making rules of the autonomous driving system and helps the traffic safety officer understand the system's adherence to traffic rules.

7.2 Financial Loan Scenario

In this financial loan process, the autonomous financial system adapts its evaluation based on external market trends and internal user data, involving two primary stakeholders: bank loan officers (decision makers) and loan customers (subjects of decision). When a customer applies for a loan, the autonomous financial system analyzes factors such as market conditions and the applicant's financial history, providing the bank loan officer with recommendations. With the assistance of these recommendations, the officer arrives at the final decision.
In this scenario, we assume two main stakeholders, their needs for explanations at a given stage, the questions they might ask, and accordingly which existing XAI technologies answer those questions to fulfill their needs. This is described further below, while the overview is shown in Figure 4.
Fig. 4. Overview of the financial loan case study.
Stakeholder 1:
Whom: Decision Makers (a bank loan officer).
When: Deployment, Monitoring and Interaction.
Need: Comprehending specific model decisions. This bank loan officer relies on the adaptive financial system to assist in deciding whether to approve a loan. To make an informed decision, the officer seeks to understand how the system arrives at its conclusions, which is crucial for determining whether to trust the system's recommendations.
What: According to Table 3, when decision-makers have the need for “comprehending specific model decisions,” the questions they may ask include “Why,” “Why-not,” and “What-else.” In this scenario, we assume that the bank loan officer asks the following two questions:
(1) Why does the system make a particular decision?
(2) What else are the similar instances?
How: Table 3 indicates that eight XAI methods (“DT,” “DR,” “Local FAE,” “PE,” “TE,” “AE,” “CF,” and “GE”) could potentially address the questions posed to fulfill that need. In this scenario, we use the following two XAI methods to answer the “Why” and “What-else” questions asked by the bank loan officer, respectively:
(1) Local FAE is used in this scenario to answer the “Why” question. The local FAE helps this bank loan officer understand the adaptive financial system's specific decisions by revealing the extent to which each piece of information (input feature) about a given customer contributes to the system's decisions.
(2) PE is used in this scenario to answer the “What-else” question. The PE provides some decision cases with similar inputs and outputs to the customer being explained, helping this bank loan officer better understand and trust the adaptive financial system's decisions.
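A minimal stand-in for such example-based (“What-else”) explanations is a nearest-neighbor retrieval over past cases that received the same decision, as sketched below with synthetic application records; it is not any specific PE algorithm.

```python
# Hedged nearest-neighbour stand-in for an example-based ("What-else") explanation:
# retrieve past cases most similar to the applicant that received the same decision.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_hist = rng.normal(size=(500, 4))                  # synthetic historical applications
y_hist = (X_hist[:, 0] - X_hist[:, 1] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X_hist, y_hist)

applicant = rng.normal(size=(1, 4))
decision = model.predict(applicant)[0]

same_decision = X_hist[model.predict(X_hist) == decision]
nn = NearestNeighbors(n_neighbors=3).fit(same_decision)
_, idx = nn.kneighbors(applicant)
print("decision:", decision)
print("similar past cases with the same decision:\n", same_decision[idx[0]])
```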
Stakeholder 2:
Whom: Subjects of Decision (a bank loan customer).
When: Deployment, Monitoring and Interaction.
Need: Understanding the decision and how to maintain or change it. This bank loan customer's loan application was rejected, and he seeks to understand why, as well as how he can improve his chances of approval in the future.
What: According to Table 3, when subjects of decision have the need for “understanding the decision and how to maintain or change it,” the questions they may ask include “Why,” “Why-not,” “What-if,” “What-else,” “How-to,” and “How-still.” In this scenario, we assume that the bank loan customer asks the following question:
(1) How to let the system make another particular decision?
How: Table 3 indicates that nine XAI methods (“DT,” “DR,” “Local FAE,” “PE,” “TE,” “AE,” “CF,” “GE,” and “PDP”) could potentially address the questions posed to fulfill that need. In this scenario, we use the following XAI method to answer the “How-to” question asked by the bank loan customer:
(1) CFE is used in this scenario to answer the “How-to” question. CFE provides explanations by informing the bank loan customer of the factors that led to the denial of his or her loan application, helping the customer to understand the adaptive financial system's decision-making. It also suggests areas for improvement, increasing the likelihood of future loan approvals.
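The sketch below illustrates the underlying idea with a deliberately simple greedy search for the smallest single-feature change that flips a toy loan model's decision; it is not the AIX360 implementation or any specific CF algorithm, and the features and thresholds are invented.

```python
# Deliberately simple greedy counterfactual sketch (not AIX360 or any specific
# CF algorithm): find the smallest single-feature change that flips the decision.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                       # toy [income, debt, history]
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)
model = LogisticRegression().fit(X, y)

applicant = np.array([-0.5, 0.8, 0.0])
print("current decision:", model.predict([applicant])[0])   # expected: 0 (deny)

best = None
for step in np.linspace(0.1, 3.0, 30):               # try increasingly large edits
    for j in range(3):
        for sign in (+1.0, -1.0):
            candidate = applicant.copy()
            candidate[j] += sign * step
            if model.predict([candidate])[0] == 1:
                best = (j, sign * step)
                break
        if best:
            break
    if best:
        break

print("smallest single-feature change that flips the decision:", best)
```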
This section demonstrates the practical application of the roadmap's guideline (Table 3) by showcasing how stakeholders can select appropriate XAI methods to meet their needs in real-world scenarios. Using the examples of autonomous driving and financial loan scenarios, we illustrate how general end-users, regulators, decision makers, and subjects of decision can utilize XAI technologies to gain insights into ML systems. This process enhances transparency, builds trust, and supports informed decision-making. The case studies serve as a guide for readers on how to effectively apply the roadmap's guideline (Table 3) to select the appropriate XAI method to meet various stakeholders’ needs for explanation.

8 Discussion

Table 3 has led to a number of important observations that reveal gaps in existing research in achieving the goal of XAI to provide human users with understandable interpretations [10] and point out promising directions for future research. These observations are discussed in this section.

8.1 Imbalanced Consideration of Different Stakeholders

Table 3 has made one thing very clear: while there is much research on XAI methods for some stakeholders (e.g., AI development experts), there is a distinct lack of research meeting the needs of other stakeholders. In fact, for several needs of regulators, data subjects, and general end-users, no XAI methods have been developed at all. This observation aligns with findings reported elsewhere [22, 96, 205]. However, these stakeholders play a vital role in both ML ecosystems and AASs. For example, in AASs such as autonomous vehicles, regulators are critical in ensuring compliance with safety standards, while end-users need to be able to trust the system's decisions. Therefore, future research efforts should prioritize providing appropriate XAI methods to these stakeholders.

8.2 Imbalanced Consideration of Different Research Directions

If we examine the types of explanations generated by existing XAI methods and the questions they try to answer, we find that there are relatively few studies of XAI methods in the following four directions.

8.2.1 XAI Methods for Interpretation of Data.

Many stakeholders express a desire to gain insight into data through XAI methods: AI development experts want to understand data characteristics, size, distribution, noise, and bias before modeling [72]; AI development experts and model validation teams want to check for dataset shift and its possible impact [22, 47]; and data subjects and general end-users want help protecting their data [145, 205, 210]. Nevertheless, our investigation has revealed a scarcity of studies that utilize XAI methods for these purposes, particularly when the traditional data analysis technique of EDA is excluded.
We find that this scarcity can largely be attributed to the limited availability of XAI methods focused on providing insight into the data, as researchers have predominantly focused on designing XAI methods for explaining ML models. Notably, Wang et al. [229] utilized FAE to detect dataset shifts; however, this approach requires the aid of an ML model and can only detect shifts that are detrimental to model performance. Another relevant contribution is the ProtoDash approach proposed in [82], which compactly represents potentially complex data distributions by mining information in datasets and selecting prototype examples, thereby conveying meaningful insights to humans in domains where simple explanations are difficult to extract. Overall, however, the scarcity of XAI methods for data interpretation remains evident, and there is a pressing need for XAI methods that enable deeper insight into data.
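To illustrate the flavor of data-level prototype explanations, the sketch below selects real records closest to k-means centroids as compact prototypes; this is a simple stand-in, not the ProtoDash algorithm of [82].

```python
# Simple k-means stand-in for data-level prototype explanation (not ProtoDash [82]):
# pick real records closest to cluster centres as compact prototypes of the data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(5, 1, (200, 2))])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
proto_idx = pairwise_distances_argmin(kmeans.cluster_centers_, X)
print("prototype rows:", proto_idx)
print(X[proto_idx])                     # a handful of real records summarizing the data
```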

8.2.2 XAI Methods for Reviewing Model Compliance.

The current regulatory landscape has witnessed governments strengthening and implementing regulations concerning the risks associated with the application of AI systems. For example, the “Ethics Guidelines for Trustworthy AI” [41] developed by the EU's High-Level Expert Group on Artificial Intelligence emphasizes the necessity for trustworthy AI systems to adhere to requirements such as algorithmic fairness, non-discrimination, privacy protection, security, transparency, and interpretability. These requirements are particularly relevant in AASs like autonomous driving or autonomous drones, where non-compliance can pose serious risks. Research has shown the desire of AI development experts to utilize XAI methods to identify and mitigate model bias, while model validation teams and regulators seek XAI methods to assist in reviewing model compliance [47, 210].
Despite the continuous strengthening of regulatory requirements for AI systems and the corresponding practical needs of stakeholders, there remains a paucity of research on how to effectively implement these requirements with the support of XAI methods. This observation was also made in [73], which stated that “the literature greatly suggests that XAI will ease this barrier, however, there has not yet been any significant empirical studies showing how XAI in fact may satisfy regulators need to ensure compliance or undergo audits.” Nevertheless, there have been some preliminary attempts to review the security [156], fairness [3, 149, 165, 234, 247], transparency [57, 109, 169], and other properties of the ML model through XAI methods.

8.2.3 XAI Methods for Accountability.

Much of the literature [10, 94, 95, 124, 145, 191] has stated that a major purpose of XAI is to enable accountability in AI, facilitating the attribution of responsibility to the human agents involved in its development, deployment, and interaction. For example, in autonomous healthcare systems, understanding who is responsible for a wrong diagnosis is vital. Unfortunately, our survey revealed no relevant research on assisting or enabling accountability through XAI. Consequently, while the idea of achieving accountability through XAI is highly attractive, its actual feasibility and research progress remain unclear. One important piece of future work is to clarify whether, how, and to what extent XAI methods can assist in accountability.

8.2.4 XAI Methods to Help Subjects of Decision Understand the ML Model.

An interesting finding from our survey is that although AI experts, subject matter experts, and subjects of decision all express a desire to understand model decisions through XAI methods, there is little research catering to the needs of subjects of decision. For example, in the medical field, where XAI methods are widely used, many studies [44, 59, 180, 202] have focused on assisting physicians in understanding model decisions, but few studies [147] have addressed the needs of patients. This is a very important future research direction if AI is to be applied in areas that affect human lives.

8.3 Imbalance in the Use of Different XAI Methods

Through our comprehensive survey, we have observed that although researchers have proposed numerous XAI methods, there is a wide variation in the frequency of using different XAI methods. In particular, the FAE method [195] emerges as the predominant choice in many studies.
(1) We counted the distribution of XAI methods across the 118 studies in Table 3 that apply XAI methods to meet stakeholders’ needs; the results are shown in Figure 5. The figure shows that the FAE method was employed in 68 of these studies, the second-ranked MV method was used in only 13 cases, and the PE method, which has the potential to meet the needs of the vast majority of stakeholders, was applied in only four cases. Even the combined number of applications of all other XAI methods does not reach that of FAE. This underscores the substantial reliance on the FAE method, with other XAI methods considerably underutilized in the existing literature.
(2) Focusing on a specific, extensively researched application scenario, namely helping decision makers in “comprehending specific model decisions,” many XAI methods could theoretically fulfill that need, including DT, DR, FAE, PE, TE, and GE. However, of the 14 relevant papers listed in Table 3, apart from one paper that explored the role of XAI from the decision maker's perspective and one that used the DR method, all 12 remaining papers relied on the FAE method for comprehending model-specific decisions (with one study utilizing both FAE and PDP).
Fig. 5. Distribution of XAI methods used in related studies in Table 3.
It is clear that there is a pronounced imbalance in the application of various XAI methods. While the preference for the FAE method may be attributed to its reputation as the most extensively studied explainability method to date [10, 22], it is essential to recognize the importance of incorporating different XAI methods into a comprehensive explanatory framework, because different XAI methods reveal different aspects of ML models. This is especially true in AASs, where different methods may be required to explain different aspects of a system's behavior (e.g., safety vs. efficiency in an autonomous vehicle). It would be interesting to see future studies use a diverse set of XAI methods to give different stakeholders a fuller explanation of ML models.

8.4 Unclear Stakeholders When Discussing XAI Methods

A crucial oversight in the current literature on XAI methods is the lack of clarity about who the appropriate stakeholders of these methods are [22, 72]. Our survey revealed that many studies focusing on the application of XAI methods also fail to explicitly identify their intended stakeholders. For example, some studies use existing XAI methods to examine the presence of bias in ML models and to assess their fairness [149, 165, 234], yet they do not specify which stakeholders their methods are intended for. It is important to recognize that although AI experts, subject matter experts, external entities, and general end-users all have a need to review and assess the fairness of models, they require different explanations because they differ in AI expertise and domain-specific knowledge [52, 72, 145, 179]. As a result, some research on XAI methods may struggle to effectively address stakeholders’ needs. Future research on XAI methods should therefore be explicit about the stakeholders, their needs at different stages of the AI system lifecycle, and the types of questions that the XAI methods would answer. We hope this article, especially Table 3, can serve as a roadmap and guideline for such research.

9 Conclusion

Existing review papers on XAI methods tend to focus on techniques and algorithms for generating explanations of ML models, while less attention has been paid to key questions such as: Who will use the XAI method? To whom does the method explain? What types of explanation does the XAI method provide, and at what stage of the AI system lifecycle? Does the explanation meet a stakeholder's need? Which XAI methods have been used to meet a stakeholder's need? This article gives a more holistic and comprehensive review of the existing literature on XAI by focusing on the four pillars in a common framework (Figure 1): who explains to whom, at which stage of the AI system lifecycle, about what, and for what purposes? The emphasis of this article is not on any individual pillar, but on the essential relationships among them.
First, the article presents an ML ecosystem in which the roles of different stakeholders can be clearly described. Second, the AI system lifecycle is used to clarify different explanation needs at different stages by different stakeholders. Third, different types of explanations are discussed, linking the discussions to different needs by stakeholders at different stages of the AI system lifecycle. Fourth, the article summarizes the XAI methods proposed in the literature. Fifth, the article provides a comprehensive mapping from XAI methods to stakeholders’ needs, capturing the key information in Table 3 and providing a roadmap to assist stakeholders in selecting the appropriate XAI method to meet their needs for explanation. Finally, based on our review, major gaps in the literature and future research directions are pointed out. It is important to clarify the target stakeholders for XAI methods during their design and application. Additionally, future research should prioritize addressing the identified gaps by focusing on specific stakeholders, research directions, and XAI methods that remain under-explored.

References

[1]
Zakaria Abou El Houda, Bouziane Brik, and Lyes Khoukhi. 2022. ‘Why should I trust your IDS?’: An explainable deep learning framework for intrusion detection systems in internet of things networks. IEEE Open Journal of the Communications Society 3 (2022), 1164–1176.
[2]
Amina Adadi and Mohammed Berrada. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6 (2018), 52138–52160.
[3]
Philip Adler, Casey Falk, Sorelle A. Friedler, Tionney Nix, Gabriel Rybeck, Carlos Scheidegger, Brandon Smith, and Suresh Venkatasubramanian. 2018. Auditing black-box models for indirect influence. Knowledge and Information Systems 54, 1 (2018), 95–122.
[4]
Yongsu Ahn and Yu-Ru Lin. 2019. Fairsight: Visual analytics for fairness in decision making. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 1086–1095.
[5]
David Alvarez Melis and Tommi Jaakkola. 2018. Towards robust interpretability with self-explaining neural networks. In Advances in Neural Information Processing Systems, Vol. 31, 7786–7795.
[6]
Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. 2019. Software engineering for machine learning: A case study. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP ’19). IEEE, 291–300.
[7]
Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. Concrete problems in AI safety. arXiv:1606.06565. Retrieved from http://arxiv.org/abs/1606.06565
[8]
Anna Markella Antoniadi, Yuhan Du, Yasmine Guendouz, Lan Wei, Claudia Mazo, Brett A. Becker, and Catherine Mooney. 2021. Current challenges and future opportunities for XAI in machine learning-based clinical decision support systems: A systematic review. Applied Sciences 11, 11 (2021), 5088.
[9]
Daniel W. Apley and Jingyu Zhu. 2020. Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82, 4 (2020), 1059–1086.
[10]
Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, et al. 2020. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58 (2020), 82–115.
[11]
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilović, et al. 2019. One explanation does not fit all: A toolkit and taxonomy of AI explainability techniques. arXiv:1909.03012. Retrieved from http://arxiv.org/abs/1909.03012
[12]
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Q. Vera Liao, Ronny Luss, Aleksandra Mojsilovic, et al. 2020. AI explainability 360: An extensible toolkit for understanding data and machine learning models. Journal of Machine Learning Research 21, 130 (2020), 1–6.
[13]
David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11 (2010), 1803–1831.
[14]
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations, 1–11.
[15]
Pieter Barnard, Irene Macaluso, Nicola Marchetti, and Luiz A. DaSilva. 2022. Resource reservation in sliced networks: An explainable artificial intelligence (XAI) approach. In Proceedings of the ICC 2022-IEEE International Conference on Communications. IEEE, 1530–1535.
[16]
John Battelle. 2013. Behind the banner, a visualization of the adtech ecosystem. Retrieved from https://battellemedia.com/archives/2013/05/behind-the-banner-a-visualization-of-the-adtech-ecosystem
[17]
David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Network dissection: Quantifying interpretability of deep visual representations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6541–6549.
[18]
Vaishak Belle and Ioannis Papantonis. 2021. Principles and practice of explainable machine learning. Frontiers in Big Data 4 (2021), 688969.
[19]
Adrien Bennetot, Jean-Luc Laurent, Raja Chatila, and Natalia Díaz-Rodríguez. 2019. Towards explainable neural-symbolic visual reasoning. In Proceedings of the IJCAI Neural-Symbolic Learning and Reasoning Workshop, 1–6. Retrieved from http://arxiv.org/abs/1909.09065
[20]
Vitor Bento, Manoela Kohler, Pedro Diaz, Leonardo Mendoza, and Marco Aurelio Pacheco. 2021. Improving deep learning performance by using explainable artificial intelligence (XAI) approaches. Discover Artificial Intelligence 1, 1 (2021), 1–11.
[21]
Umang Bhatt, Adrian Weller, and José M. F. Moura. 2021. Evaluating and aggregating feature-based model explanations. In Proceedings of the 29th International Joint Conference on Artificial Intelligence, 3016–3022.
[22]
Umang Bhatt, Alice Xiang, Shubham Sharma, Adrian Weller, Ankur Taly, Yunhan Jia, Joydeep Ghosh, Ruchir Puri, José MF Moura, and Peter Eckersley. 2020. Explainable machine learning in deployment. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 648–657.
[23]
P. Biecek. 2019. Ceteris paribus plots (what-if plots) for explanations of a single observation. Retrieved from https://github.com/pbiecek/ceterisParibus
[24]
William J. Bingley, Caitlin Curtis, Steven Lockey, Alina Bialkowski, Nicole Gillespie, S. Alexander Haslam, Ryan K. L. Ko, Niklas Steffens, Janet Wiles, and Peter Worthy. 2023. Where is the human in human-centered AI? Insights from developer priorities and user experiences. Computers in Human Behavior 141 (2023), 107617.
[25]
Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Larry J. Ackel, Urs Muller, Phil Yeres, and Karol Zieba. 2018. VisualBackProp: Efficient visualization of CNNs for autonomous driving. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA ’18). 4701–4708.
[26]
Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, and Urs Muller. 2017. Explaining how a deep neural network trained with end-to-end learning steers a car. arXiv:1704.07911. Retrieved from http://arxiv.org/abs/1704.07911
[27]
Andrea Bontempelli, Stefano Teso, Fausto Giunchiglia, and Andrea Passerini. 2022. Concept-level debugging of part-prototype networks. In Proceedings of the Workshop on Trustworthy Artificial Intelligence as a Part of the ECML/PKDD 22 Program, 1–13
[28]
Olcay Boz. 2002. Extracting decision trees from trained neural networks. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 456–461.
[29]
Martim Brandao and Daniele Magazzeni. 2020. Explaining plans at scale: Scalable path planning explanations in navigation meshes using inverse optimization. In Proceedings of the International Conference on Automated Planning and Scheduling, 31, 1 (2020), 56–64.
[30]
Andrea Brennen. 2020. What do people really want when they say they want “explainable AI?” We asked 60 stakeholders. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, 1–7.
[31]
Longbing Cao. 2022. AI in finance: Challenges, techniques, and opportunities. ACM Computing Surveys 55, 3 (2022), 1–38.
[32]
Steffen Castle, Robert Schwarzenberg, and Mohsen Pourvali. 2021. Detecting covariate drift with explanations. In Proceedings of the CCF International Conference on Natural Language Processing and Chinese Computing. Springer, 317–322.
[33]
Larissa Chazette and Kurt Schneider. 2020. Explainability as a non-functional requirement: Challenges and recommendations. Requirements Engineering 25, 4 (2020), 493–514.
[34]
Ching-Ju Chen, Ling-Wei Chen, Chun-Hao Yang, Ya-Yu Huang, and Yueh-Min Huang. 2021. Improving CNN-based pest recognition with a post-hoc explanation of XAI. Retrieved from http://www.researchsquare.com/article/rs-782408/v1
[35]
Dehua Chen, Hongjin Zhao, Jianrong He, Qiao Pan, and Weiliang Zhao. 2021. An causal XAI diagnostic model for breast cancer based on mammography reports. In Proceedings of the 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM ’21), 3341–3349.
[36]
Ning Chen, Bernardete Ribeiro, and An Chen. 2016. Financial credit risk assessment: A recent review. Artificial Intelligence Review 45 (2016), 1–23.
[37]
Runpu Chen, Le Yang, Steve Goodison, and Yijun Sun. 2020. Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data. Bioinformatics 36, 5 (2020), 1476–1483.
[38]
Youngwon Choi, Wenxi Yu, Mahesh B. Nagarajan, Pangyu Teng, Jonathan G. Goldin, Steven S. Raman, Dieter R. Enzmann, Grace Hyun J. Kim, and Matthew S. Brown. 2023. Translating AI to clinical practice: Overcoming data shift with explainability. Radiographics 43, 5 (2023), e220105.
[39]
Hotman Christianto, Gary Kee Khoon Lee, Zhou Weigui Jair, Henry Kasim, and Deepu Rajan. 2022. Smart interpretable model (SIM) enabling subject matter experts in rule generation. Expert Systems with Applications 207 (2022), 117945.
[40]
Ching-Hua Chuan, Ruoyu Sun, Shiyun Tian, and Wan-Hsiu Sunny Tsai. 2024. EXplainable artificial intelligence (XAI) for facilitating recognition of algorithmic bias: An experiment from imposed users’ perspectives. Telematics and Informatics 91 (2024), 102135.
[41]
European Commission. 2019. Ethics guidelines for trustworthy AI. Retrieved from https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai
[42]
Mark Craven and Jude Shavlik. 1995. Extracting tree-structured representations of trained networks. In Advances in Neural Information Processing Systems, Vol. 8, 24–30.
[43]
Mark W. Craven and Jude W. Shavlik. 1994. Using sampling and queries to extract rules from trained neural networks. In Proceedings of the 11th International Conference on International Conference on Machine Learning, 37–45.
[44]
Harry Freitas Da Cruz, Boris Pfahringer, Tom Martensen, Frederic Schneider, Alexander Meyer, Erwin Böttinger, and Matthieu-P. Schapranow. 2021. Using interpretability approaches to update “black-box” clinical prediction models: An external validation study in nephrology. Artificial Intelligence in Medicine 111 (2021), 101982.
[45]
Anupam Datta, Shayak Sen, and Yair Zick. 2016. Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In Proceedings of the 2016 IEEE Symposium on Security and Privacy (SP ’16), 598–617.
[46]
Omer Deperlioglu, Utku Kose, Deepak Gupta, Ashish Khanna, Fabio Giampaolo, and Giancarlo Fortino. 2022. Explainable framework for Glaucoma diagnosis by image processing and convolutional neural network synergy: Analysis with doctor evaluation. Future Generation Computer Systems 129 (2022), 152–169.
[47]
Shipi Dhanorkar, Christine T. Wolf, Kun Qian, Anbang Xu, Lucian Popa, and Yunyao Li. 2021. Who needs to know what, when?: Broadening the explainable AI (XAI) design space by looking at explanations across the AI lifecycle. In Proceedings of the Designing Interactive Systems Conference 2021, 1591–1602.
[48]
Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, and Payel Das. 2018. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. In Advances in Neural Information Processing Systems, Vol. 31, 592–603.
[49]
Pedro Domingos. 2012. A few useful things to know about machine learning. Communications of the ACM 55, 10 (2012), 78–87.
[50]
Jiqian Dong, Sikai Chen, Mohammad Miralinaghi, Tiantian Chen, Pei Li, and Samuel Labi. 2023. Why did the AI make that decision? Towards an explainable artificial intelligence (XAI) for autonomous driving systems. Transportation Research Part C: Emerging Technologies 156 (2023), 104358.
[51]
Yinpeng Dong, Hang Su, Jun Zhu, and Bo Zhang. 2017. Improving interpretability of deep neural networks with semantic information. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4306–4314.
[52]
Finale Doshi-Velez and Been Kim. 2017. Towards a rigorous science of interpretable machine learning. arXiv:1702.08608. Retrieved from http://arxiv.org/abs/1702.08608
[53]
Finale Doshi-Velez, Mason Kortz, Ryan Budish, Chris Bavitz, Sam Gershman, David O’Brien, Kate Scott, Stuart Schieber, James Waldo, David Weinberger, et al. 2017. Accountability of AI under the law: The role of explanation. arXiv:1711.01134. Retrieved from http://arxiv.org/abs/1711.01134
[54]
Jeff Druce, Michael Harradon, and James Tittle. 2019. Explainable artificial intelligence (XAI) for increasing user trust in deep reinforcement learning driven autonomous systems. In Proceedings of the NeurIPS 2019 Deep RL Workshop, 1–9. Retrieved from http://arxiv.org/abs/2106.03775
[55]
Joanita DSouza and Senthil Velan S. 2020. Using exploratory data analysis for generating inferences on the correlation of COVID-19 cases. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT ’20). IEEE, 1–6.
[56]
Hang Du, Hailin Shi, Dan Zeng, Xiao-Ping Zhang, and Tao Mei. 2020. The elements of end-to-end deep face recognition: A survey of recent advances. ACM Computing Surveys 54, 10s (2020), 1–42.
[57]
Upol Ehsan, Q. Vera Liao, Michael Muller, Mark O. Riedl, and Justin D. Weisz. 2021. Expanding explainability: Towards social transparency in AI systems. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–19.
[58]
Upol Ehsan, Philipp Wintersberger, Q. Vera Liao, Martina Mara, Marc Streit, Sandra Wachter, Andreas Riener, and Mark O. Riedl. 2021. Operationalizing human-centered perspectives in explainable AI. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, 1–6.
[59]
Shaker El-Sappagh, Jose M. Alonso, S. M. Islam, Ahmad M. Sultan, and Kyung Sup Kwak. 2021. A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer's disease. Scientific Reports 11, 1 (2021), 1–26.
[60]
Mica R. Endsley. 2017. From here to autonomy: Lessons learned from human–automation research. Human Factors 59, 1 (2017), 5–27.
[61]
Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2009. Visualizing higher-layer features of a deep network. University of Montreal 1341, 3 (2009), 1–13.
[62]
Dumitru Erhan, Aaron Courville, and Yoshua Bengio. 2010. Understanding Representations Learned in Deep Architectures. Technical Report 1355, Département d'Informatique et Recherche Opérationnelle, University of Montreal, QC, Canada.
[63]
Gary Ericson, William Anton Rohm, Josée Martens, Kent Sharkey, Craig Casey, Beth Harvey, and Nick Schonning. 2017. Team data science process documentation. Retrieved 11 April 2017 from http://learn.microsoft.com/en-us/azure/architecture/data-science-process/overview
[64]
Magnus Falk. 2019. Artificial intelligence in the boardroom. Retrieved from https://www.fca.org.uk/insight/artificial-intelligence-boardroom
[65]
Fan Fang, Carmine Ventre, Lingbo Li, Leslie Kanthan, Fan Wu, and Michail Basios. 2020. Better model selection with a new definition of feature importance. arXiv:2009.07708. Retrieved from http://arxiv.org/abs/2009.07708
[66]
Juliana Jansen Ferreira and Mateus Monteiro. 2021. The human-AI relationship in decision-making: AI explanation to support people on justifying their decisions. arXiv:2102.05460. Retrieved from http://arxiv.org/abs/2102.05460
[67]
Juliana J. Ferreira and Mateus S. Monteiro. 2020. What are people doing about XAI user experience? A survey on AI explainability research and practice. In Proceedings of the International Conference on Human-Computer Interaction, 56–73.
[68]
Jerome H. Friedman. 2001. Greedy function approximation: A gradient boosting machine. Annals of Statistics 29, 5 (2001), 1189–1232.
[69]
Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, and Kristian Kersting. 2023. A typology for exploring the mitigation of shortcut behaviour. Nature Machine Intelligence 5, 3 (2023), 319–330.
[70]
Detlev Gabel and Tim Hickman. 2019. GDPR handbook: Unlocking the EU general data protection regulation. White & Case Technology Newsflash (2019). Retrieved from https://www.whitecase.com/insight-our-thinking/gdpr-handbook-unlocking-eu-general-data-protection-regulation
[71]
Chen Gao, Tzu-Heng Lin, Nian Li, Depeng Jin, and Yong Li. 2023. Cross-platform item recommendation for online social e-commerce. IEEE Transactions on Knowledge and Data Engineering 35, 2 (2023), 1351–1364.
[72]
Julie Gerlings, Millie Søndergaard Jensen, and Arisa Shollo. 2022. Explainable AI, but explainable to whom? An exploratory case study of XAI in healthcare. In Handbook of Artificial Intelligence in Healthcare. Springer, 169–198.
[73]
Julie Gerlings, Arisa Shollo, and Ioanna Constantiou. 2020. Reviewing the need for explainable artificial intelligence (XAI). arXiv:2012.01007. Retrieved from http://arxiv.org/abs/2012.01007
[74]
Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. 2018. Explaining explanations: An overview of interpretability of machine learning. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA ’18), 80–89.
[75]
Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 24, 1 (2015), 44–65.
[76]
Przemyslaw A. Grabowicz, Nicholas Perello, and Aarshee Mishra. 2022. Marrying fairness and explainability in supervised learning. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 1905–1916.
[77]
Yulia Grushetskaya, Mike Sips, Reyko Schachtschneider, Mohammadmehdi Saberioon, and Akram Mahan. 2024. HPExplorer: XAI method to explore the relationship between hyperparameters and model performance. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 319–334.
[78]
Venkat Gudivada, Amy Apon, and Junhua Ding. 2017. Data quality considerations for big data and machine learning: Going beyond data cleaning and transformations. International Journal on Advances in Software 10, 1 (2017), 1–20.
[79]
Riccardo Guidotti, Anna Monreale, Fosca Giannotti, Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. 2019. Factual and counterfactual explanations for black box decision making. IEEE Intelligent Systems 34, 6 (2019), 14–23.
[80]
Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. 2018. A survey of methods for explaining black box models. ACM Computing Surveys 51, 5 (2018), 1–42.
[81]
Calvin Guillot Suarez. 2022. Human-in-the-Loop Hyperparameter Tuning of Deep Nets to Improve Explainability of Classifications. Master's thesis. Aalto University. School of Electrical Engineering. Retrieved from http://urn.fi/URN:NBN:fi:aalto-202205223354
[82]
Karthik S. Gurumoorthy, Amit Dhurandhar, Guillermo Cecchi, and Charu Aggarwal. 2019. Efficient data representation by selecting prototypes with importance weights. In Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM ’19), 260–269.
[83]
Mark Haakman, Luís Cruz, Hennie Huijgens, and Arie van Deursen. 2021. AI lifecycle models need to be revised. Empirical Software Engineering 26, 5 (2021), 1–29.
[84]
David Hardage and Peyman Najafirad. 2020. Hate and toxic speech detection in the context of covid-19 pandemic using XAI: Ongoing applied research. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, 1–5.
[85]
Michael Harradon, Jeff Druce, and Brian Ruttenberg. 2018. Causal learning and explanation of deep neural networks via autoencoded activations. arXiv:1802.00541. Retrieved from http://arxiv.org/abs/1802.00541
[86]
Alexander Heimerl, Katharina Weitz, Tobias Baur, and Elisabeth Andre. 2020. Unraveling ML models of emotion with nova: Multi-level explainable AI for non-experts. IEEE Transactions on Affective Computing 13, 3 (2020), 1155–1167.
[87]
Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, and Trevor Darrell. 2016. Generating visual explanations. In Proceedings of the European Conference on Computer Vision. Springer, 3–19.
[88]
Andreas Henelius, Kai Puolamäki, Henrik Boström, Lars Asker, and Panagiotis Papapetrou. 2014. A peek into the black box: Exploring classifiers by randomization. Data Mining and Knowledge Discovery 28, 5 (2014), 1503–1529.
[89]
Robert R. Hoffman, Shane T. Mueller, Gary Klein, and Jordan Litman. 2018. Metrics for explainable AI: Challenges and prospects. arXiv:1812.04608. Retrieved from http://arxiv.org/abs/1812.04608
[90]
Fred Hohman, Haekyu Park, Caleb Robinson, and Duen Horng Polo Chau. 2019. Summit: Scaling deep learning interpretability by visualizing activation and attribution summarizations. IEEE Transactions on Visualization and Computer Graphics 26, 1 (2019), 1096–1106.
[91]
Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis, and Douglas B. Kell. 2017. What do we need to build explainable AI systems for the medical domain? arXiv:1712.09923. Retrieved from http://arxiv.org/abs/1712.09923
[92]
Changwu Huang, Zeqi Zhang, Bifei Mao, and Xin Yao. 2023. An overview of artificial intelligence ethics. IEEE Transactions on Artificial Intelligence 4, 4 (2023), 799–819.
[93]
Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, and Marc Streit. 2024. Reassuring, misleading, debunking: Comparing effects of XAI methods on human decisions. ACM Transactions on Interactive Intelligent Systems 14, 3 (2024), 1–36.
[94]
Fatima Hussain, Rasheed Hussain, and Ekram Hossain. 2021. Explainable artificial intelligence (XAI): An engineering perspective. arXiv:2101.03613. Retrieved from http://arxiv.org/abs/2101.03613
[95]
Gandhi Jafta, Alta de Waal, Iena Derks, and Emma Ruttkamp-Bloem. 2022. Evaluation of XAI as an enabler for fairness, accountability and transparency. In Proceedings of the 2nd Southern African Conference for Artificial Intelligence Research, 541–542.
[96]
Helen Jiang and Erwen Senge. 2021. On two XAI cultures: A case study of non-technical explanations in deployed AI system. arXiv:2112.01016. Retrieved from http://arxiv.org/abs/2112.01016
[97]
Anna Jobin, Marcello Ienca, and Effy Vayena. 2019. The global landscape of AI ethics guidelines. Nature Machine Intelligence 1, 9 (2019), 389–399.
[98]
Ulf Johansson, Rikard König, and Lars Niklasson. 2003. Rule extraction from trained neural networks using genetic programming. In Proceedings of the 13th International Conference on Artificial Neural Networks, 13–16.
[99]
Ulf Johansson and Lars Niklasson. 2009. Evolving decision trees using oracle guides. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence and Data Mining. IEEE, 238–244.
[100]
Neesha Jothi, Wahidah Husain, and Nur’Aini Abdul Rashid. 2021. Predicting generalized anxiety disorder among women using Shapley value. Journal of Infection and Public Health 14, 1 (2021), 103–108.
[101]
Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, and Duen Horng Chau. 2017. ActiVis: Visual exploration of industry-scale deep neural network models. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 88–97.
[102]
Uday Kamath and John Liu. 2021. Explainable Artificial Intelligence: An Introduction to Interpretable Machine Learning. Springer.
[103]
Amany A. Kandeel, Hazem M. Abbas, and Hossam S. Hassanein. 2021. Explainable model selection of a convolutional neural network for driver's facial emotion identification. In Proceedings of the International Conference on Pattern Recognition. Springer, 699–713.
[104]
Ramisetty Kavya, Jabez Christopher, Subhrakanta Panda, and Y. Bakthasingh Lazarus. 2021. Machine learning and XAI approaches for allergy diagnosis. Biomedical Signal Processing and Control 69 (2021), 102681.
[105]
Aditya Khamparia, Deepak Gupta, Ashish Khanna, and Valentina E. Balas. 2022. Biomedical Data Analysis and Processing Using Explainable (XAI) and Responsive Artificial Intelligence (RAI). Springer.
[106]
Nadia Khan, Muhammad Nauman, Ahmad S. Almadhor, Nadeem Akhtar, Abdullah Alghuried, and Adi Alhudhaif. 2024. Guaranteeing correctness in black-box machine learning: A fusion of explainable AI and formal methods for healthcare decision-making. IEEE Access 12 (2024), 90299–90316.
[107]
Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, and Zeynep Akata. 2018. Textual explanations for self-driving vehicles. In Proceedings of the European Conference on Computer Vision (ECCV ’18), 563–578.
[108]
Miryung Kim, Thomas Zimmermann, Robert DeLine, and Andrew Begel. 2016. The emerging role of data scientists on software development teams. In Proceedings of the 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE ’16). IEEE, 96–107.
[109]
Inna Kolyshkina and Simeon Simoff. 2019. Interpretability of machine learning solutions in industrial decision engineering. In Proceedings of the Australasian Conference on Data Mining. Springer, 156–170.
[110]
Matthieu Komorowski, Dominic C. Marshall, Justin D. Salciccioli, and Yves Crutain. 2016. Exploratory data analysis. In Secondary Analysis of Electronic Health Records. Springer International Publishing, Cham, 185–203.
[111]
Josua Krause, Adam Perer, and Kenney Ng. 2016. Interacting with predictions: Visual inspection of black-box machine learning models. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 5686–5697.
[112]
R. Krishnan, G. Sivakumar, and P. Bhattacharya. 1999. Extracting decision trees from trained neural networks. Pattern Recognition 32, 12 (1999), 1999–2009.
[113]
Jean-Philippe Kröll, Simon B. Eickhoff, Felix Hoffstaedter, and Kaustubh R. Patil. 2020. Evolving complex yet interpretable representations: Application to Alzheimer's diagnosis and prognosis. In Proceedings of the 2020 IEEE Congress on Evolutionary Computation (CEC ’20). IEEE, 1–8.
[114]
Todd Kulesza, Margaret Burnett, Weng-Keen Wong, and Simone Stumpf. 2015. Principles of explanatory debugging to personalize interactive machine learning. In Proceedings of the 20th International Conference on Intelligent User Interfaces, 126–137.
[115]
Markus Langer, Daniel Oster, Timo Speith, Holger Hermanns, Lena Kästner, Eva Schmidt, Andreas Sesing, and Kevin Baum. 2021. What do we want from explainable artificial intelligence (XAI)?–A stakeholder perspective on XAI and a conceptual model guiding interdisciplinary XAI research. Artificial Intelligence 296 (2021), 103473.
[116]
Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. 2017. Inverse classification for comparison-based interpretability in machine learning. arXiv:1712.08443. Retrieved from http://arxiv.org/abs/1712.08443
[117]
Tao Lei, Regina Barzilay, and Tommi Jaakkola. 2016. Rationalizing neural predictions. arXiv:1606.04155. Retrieved from http://arxiv.org/abs/1606.04155
[118]
Bruno Lepri, Nuria Oliver, Emmanuel Letouzé, Alex Pentland, and Patrick Vinck. 2018. Fair, transparent, and accountable algorithmic decision-making processes. Philosophy & Technology 31, 4 (2018), 611–627.
[119]
Oscar Li, Hao Liu, Chaofan Chen, and Cynthia Rudin. 2018. Deep learning for case-based reasoning through prototypes: A neural network that explains its predictions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32, 3530–3537.
[120]
Q. Vera Liao, Daniel Gruen, and Sarah Miller. 2020. Questioning the AI: Informing design practices for explainable AI user experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–15.
[121]
Q. Vera Liao and Kush R. Varshney. 2021. Human-centered explainable AI (XAI): From algorithms to user experiences. arXiv:2110.10790. Retrieved from http://arxiv.org/abs/2110.10790
[122]
Brian Y. Lim and Anind K. Dey. 2010. Toolkit to support intelligibility in context-aware applications. In Proceedings of the 12th ACM International Conference on Ubiquitous Computing, 13–22.
[123]
Brian Y. Lim, Anind K. Dey, and Daniel Avrahami. 2009. Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2119–2128.
[124]
Gabriel Lima, Nina Grgić-Hlača, Jin Keun Jeong, and Meeyoung Cha. 2022. The conflict between explainable and accountable decision-making algorithms. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 2103–2113.
[125]
Pantelis Linardatos, Vasilis Papastefanopoulos, and Sotiris Kotsiantis. 2020. Explainable AI: A review of machine learning interpretability methods. Entropy 23, 1 (2020), 18.
[126]
Mengchen Liu, Jiaxin Shi, Kelei Cao, Jun Zhu, and Shixia Liu. 2017. Analyzing the training processes of deep generative models. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 77–87.
[127]
Jörn Lötsch, Dario Kringel, and Alfred Ultsch. 2021. Explainable artificial intelligence (XAI) in biomedicine: Making AI decisions trustworthy for physicians and patients. BioMedInformatics 2, 1 (2021), 1–17.
[128]
Yin Lou, Rich Caruana, and Johannes Gehrke. 2012. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 150–158.
[129]
Yin Lou, Rich Caruana, Johannes Gehrke, and Giles Hooker. 2013. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 623–631.
[130]
Scott M. Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, 4768–4777.
[131]
Scott M. Lundberg, Bala Nair, Monica S. Vavilala, Mayumi Horibe, Michael J. Eisses, Trevor Adams, David E. Liston, Daniel King-Wai Low, Shu-Fang Newman, Jerry Kim, et al. 2018. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nature Biomedical Engineering 2, 10 (2018), 749–760.
[132]
Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv:1508.04025. Retrieved from http://arxiv.org/abs/1508.04025
[133]
Avleen Malhi, Samanta Knapic, and Kary Främling. 2020. Explainable agents for less bias in human-agent decision making. In Proceedings of the International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems. Springer, 129–146.
[134]
Nicholas Maltbie, Nan Niu, Matthew Van Doren, and Reese Johnson. 2021. XAI tools in the public sector: A case study on predicting combined sewer overflows. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1032–1044.
[135]
Wilson E. Marcílio and Danilo M. Eler. 2020. From explanations to feature selection: Assessing SHAP values as feature selection mechanism. In Proceedings of the 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI ’20). IEEE, 340–347.
[136]
Andrés R. Masegosa, Ana M. Martínez, Darío Ramos-López, Helge Langseth, Thomas D. Nielsen, and Antonio Salmerón. 2020. Analyzing concept drift: A case study in the financial sector. Intelligent Data Analysis 24, 3 (2020), 665–688.
[137]
Kate Matsudaira. 2015. The science of managing data science. Communications of the ACM 58, 6 (2015), 44–47.
[138]
John A. McDermid, Yan Jia, Zoe Porter, and Ibrahim Habli. 2021. Artificial intelligence explainability: The technical and ethical dimensions. Philosophical Transactions of the Royal Society A 379, 2207 (2021), 20200363.
[139]
Linhao Meng, Stef Van Den Elzen, and Anna Vilanova. 2022. ModelWise: Interactive model comparison for model diagnosis, improvement and selection. Computer Graphics Forum 41 (2022), 97–108.
[140]
Christian Meske, Enrico Bunde, Johannes Schneider, and Martin Gersch. 2022. Explainable artificial intelligence: Objectives, stakeholders, and future research opportunities. Information Systems Management 39, 1 (2022), 53–63.
[141]
Agnieszka Mikołajczyk, Michał Grochowski, and Arkadiusz Kwasigroch. 2021. Towards explainable classifiers using the counterfactual approach-global explanations for discovering bias in data. Journal of Artificial Intelligence and Soft Computing Research 11, 1 (2021), 51–67.
[142]
Tim Miller. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267 (2019), 1–38.
[143]
Yao Ming, Huamin Qu, and Enrico Bertini. 2018. RuleMatrix: Visualizing and understanding classifiers with rules. IEEE Transactions on Visualization and Computer Graphics 25, 1 (2018), 342–352.
[144]
Brent Daniel Mittelstadt, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, and Luciano Floridi. 2016. The ethics of algorithms: Mapping the debate. Big Data & Society 3, 2 (2016), 2053951716679679.
[145]
Sina Mohseni, Niloofar Zarei, and Eric D. Ragan. 2021. A multidisciplinary survey and framework for design and evaluation of explainable AI systems. ACM Transactions on Interactive Intelligent Systems 11, 3–4 (2021), 1–45.
[146]
Christoph Molnar. 2020. Interpretable Machine Learning. Lulu.com.
[147]
Isaac Monteath and Raymond Sheh. 2018. Assisted and incremental medical diagnosis using explainable artificial intelligence. In Proceedings of the 2nd Workshop on Explainable Artificial Intelligence, 104–108.
[148]
Pedro A. Moreno-Sanchez. 2021. An automated feature selection and classification pipeline to improve explainability of clinical prediction models. In Proceedings of the 2021 IEEE 9th International Conference on Healthcare Informatics (ICHI ’21). IEEE, 527–534.
[149]
Edoardo Mosca, Maximilian Wich, and Georg Groh. 2021. Understanding and interpreting the impact of user context in hate speech detection. In Proceedings of the 9th International Workshop on Natural Language Processing for Social Media, 91–102.
[150]
Sakib Mostafa, Debajyoti Mondal, Michael Beck, Christopher Bidinosti, Christopher Henry, and Ian Stavness. 2021. Visualizing feature maps for model selection in convolutional neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 1362–1371.
[151]
Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 607–617.
[152]
F. G. Mourik. 2023. IterSHAP: An XAI Feature Selection Method for Small High-Dimensional Datasets. Master's thesis. University of Twente.
[153]
Shane T. Mueller. 2020. Cognitive anthropomorphism of AI: How humans and computers classify images. Ergonomics in Design 28, 3 (2020), 12–19.
[154]
Fatemeh Nargesian, Horst Samulowitz, Udayan Khurana, Elias B. Khalil, and Deepak S. Turaga. 2017. Learning feature engineering for classification. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2529–2535.
[155]
Shweta Narkar, Yunfeng Zhang, Q. Vera Liao, Dakuo Wang, and Justin D. Weisz. 2021. Model LineUpper: Supporting interactive model comparison at multiple levels for AutoML. In Proceedings of the 26th International Conference on Intelligent User Interfaces, 170–174.
[156]
Tien N. Nguyen and Raymond Choo. 2021. Human-in-the-loop XAI-enabled vulnerability detection, investigation, and mitigation. In Proceedings of the 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE ’21). IEEE, 1210–1212.
[157]
Chris Olah, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, and Alexander Mordvintsev. 2018. The building blocks of interpretability. Distill 3, 3 (2018), e10.
[158]
Julian D. Olden and Donald A. Jackson. 2002. Illuminating the “black box”: A randomization approach for understanding variable contributions in artificial neural networks. Ecological Modelling 154, 1–2 (2002), 135–150.
[159]
Lara O’Reilly. 2014. Here's one way to find out which advertisers are tracking you across the internet. Retrieved from https://www.businessinsider.com/floodwatch-ad-tracking-chrome-extension-2014-10
[160]
Dieudonne N. Ouedraogo. 2021. Interpretable machine learning model selection for breast cancer diagnosis based on k-means clustering. Applied Medical Informatics 43, 3 (2021), 91–102.
[161]
James Overton. 2011. Scientific explanation and computation. In Proceedings of the Workshop on Explanation-aware Computing (ExaCt ’11), 41–50.
[162]
Frederik Pahde, Maximilian Dreyer, Wojciech Samek, and Sebastian Lapuschkin. 2023. Reveal to revise: An explainable AI life cycle for iterative bias correction of deep models. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 596–606.
[163]
Iam Palatnik de Sousa, Marley M. B. R. Vellasco, and Eduardo Costa da Silva. 2021. Explainable artificial intelligence for bias detection in COVID CT-scan classifiers. Sensors 21, 16 (2021), 5657.
[164]
Raja Parasuraman, Thomas B. Sheridan, and Christopher D. Wickens. 2008. Situation awareness, mental workload, and trust in automation: Viable, empirically supported cognitive engineering constructs. Journal of Cognitive Engineering and Decision Making 2, 2 (2008), 140–160.
[165]
Ankita Ramjibhai Patel, Jaganmohan Chandrasekaran, Yu Lei, Raghu N. Kacker, and D. Richard Kuhn. 2022. A combinatorial approach to fairness testing of machine learning models. In Proceedings of the 2022 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW ’22). IEEE, 94–101.
[166]
Junfeng Peng, Kaiqiang Zou, Mi Zhou, Yi Teng, Xiongyong Zhu, Feifei Zhang, and Jun Xu. 2021. An explainable artificial intelligence framework for the deterioration risk prediction of hepatitis patients. Journal of Medical Systems 45, 5 (2021), 1–9.
[167]
Jeremy Petch, Shuang Di, and Walter Nelson. 2022. Opening the black box: The promise and limitations of explainable machine learning in cardiology. Canadian Journal of Cardiology 38, 2 (2022), 204–213.
[168]
Nicola Pezzotti, Thomas Höllt, Jan Van Gemert, Boudewijn P. F. Lelieveldt, Elmar Eisemann, and Anna Vilanova. 2017. DeepEyes: Progressive visual analytics for designing deep neural networks. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 98–108.
[169]
Anna Polzer, Jürgen Fleiß, Thomas Ebner, Philipp Kainz, Christoph Koeth, and Stefan Thalmann. 2022. Validation of AI-based information systems for sensitive use cases: Using an XAI approach in pharmaceutical engineering. In Proceedings of the 55th Hawaii International Conference on System Sciences, 1500–1509.
[170]
Romila Pradhan, Jiongli Zhu, Boris Glavic, and Babak Salimi. 2022. Interpretable data-based explanations for fairness debugging. In Proceedings of the 2022 International Conference on Management of Data, 247–261.
[171]
Alun Preece, Dan Harborne, Dave Braines, Richard Tomsett, and Supriyo Chakraborty. 2018. Stakeholders in explainable AI. arXiv:1810.00184. Retrieved from http://arxiv.org/abs/1810.00184
[172]
Pearl Pu and Li Chen. 2007. Trust-inspiring explanation interfaces for recommender systems. Knowledge-Based Systems 20, 6 (2007), 542–556.
[173]
Jaber Rad, Karthik K. Tennankore, Amanda Vinson, and Syed Sibte Raza Abidi. 2022. Extracting surrogate decision trees from black-box models to explain the temporal importance of clinical features in predicting kidney graft survival. In Proceedings of the International Conference on Artificial Intelligence in Medicine. Springer, 88–98.
[174]
Enayat Rajabi and Kobra Etminani. 2021. Towards a knowledge graph-based explainable decision support system in healthcare. Studies in Health Technology and Informatics 281 (2021), 502–503.
[175]
Gabrielle Ras, Ning Xie, Marcel van Gerven, and Derek Doran. 2022. Explainable deep learning: A field guide for the uninitiated. Journal of Artificial Intelligence Research 73 (2022), 329–397.
[176]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. ‘Why should I trust you?’ Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144.
[177]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2018. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32, 1527–1535.
[178]
Laura Rieger, Chandan Singh, William Murdoch, and Bin Yu. 2020. Interpretations are useful: Penalizing explanations to align neural networks with prior knowledge. In Proceedings of the 37th International Conference on Machine Learning. PMLR, 8116–8126.
[179]
Avi Rosenfeld and Ariella Richardson. 2019. Explainability in human–agent systems. Autonomous Agents and Multi-Agent Systems 33, 6 (2019), 673–705.
[180]
Patrik Sabol, Peter Sinčák, Pitoyo Hartono, Pavel Kočan, Zuzana Benetinová, Alžbeta Blichárová, L’udmila Verbóová, Erika Štammová, Antónia Sabolová-Fabianová, et al. 2020. Explainable classifier for improving the accountability in decision-making for colorectal cancer diagnosis from histopathological images. Journal of Biomedical Informatics 109 (2020), 103523.
[181]
Waddah Saeed and Christian Omlin. 2023. Explainable AI (XAI): A systematic meta-survey of current challenges and future opportunities. Knowledge-Based Systems 263 (2023), 110273.
[182]
Wojciech Samek and Klaus-Robert Müller. 2019. Towards explainable artificial intelligence. In Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Springer, 5–22.
[183]
Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. 2017. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv:1708.08296. Retrieved from http://arxiv.org/abs/1708.08296
[184]
Salih Sarp, Murat Kuzlu, Emmanuel Wilson, Umit Cali, and Ozgur Guler. 2021. The enlightening role of explainable artificial intelligence in chronic wound classification. Electronics 10, 12 (2021), 1406.
[185]
Nadine B. Sarter and David D. Woods. 1995. Autonomy, authority, and observability: The evolution of critical automation properties and their impact on man-machine coordination and cooperation. In Proceedings of the 6th IFAC/IFIP/IFORS/IEA Symposium on Analysis, Design, and Evaluation of Man-Machine Systems, 149–152.
[186]
Benjamin Schmidt. 2016. A public exploratory data analysis of gender bias in teaching evaluations. In Proceedings of the Workshop on Visualization for the Digital Humanities (Vis4DH ’16), 1–4.
[187]
Tjeerd A. J. Schoonderwoerd, Wiard Jorritsma, Mark A. Neerincx, and Karel Van Den Bosch. 2021. Human-centered XAI: Developing design patterns for explanations of clinical decision support systems. International Journal of Human-Computer Studies 154 (2021), 102684.
[188]
Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Franziska Herbert, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, and Kristian Kersting. 2020. Making deep neural networks right for the right scientific reasons by interacting with their explanations. Nature Machine Intelligence 2, 8 (2020), 476–486.
[189]
Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2017. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, 618–626.
[190]
Rudy Setiono and Huan Liu. 1995. Understanding neural networks via rule extraction. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, Vol. 1, 480–485.
[191]
Arash Shaban-Nejad, Martin Michalowski, John S. Brownstein, and David L. Buckeridge. 2021. Guest editorial explainable AI: Towards fairness, accountability, transparency and trust in healthcare. IEEE Journal of Biomedical and Health Informatics 25, 7 (2021), 2374–2375.
[192]
Hetan Shah. 2018. Algorithmic accountability. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 376, 2128 (2018), 20170362.
[193]
Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning. PMLR, 3145–3153.
[194]
Andrew Silva, Mariah Schrum, Erin Hedlund-Botti, Nakul Gopalan, and Matthew Gombolay. 2023. Explainable artificial intelligence: Evaluating the objective and subjective impacts of XAI on human-agent interaction. International Journal of Human–Computer Interaction 39, 7 (2023), 1390–1404.
[195]
Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2014. Deep inside convolutional networks: Visualising image classification models and saliency maps. In Proceedings of the Workshop at the International Conference on Learning Representations, 1–8.
[196]
Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, and Martin Wattenberg. 2017. SmoothGrad: Removing noise by adding noise. arXiv:1706.03825. Retrieved from http://arxiv.org/abs/1706.03825
[197]
Peter Sollich. 2002. Bayesian methods for support vector machines: Evidence and predictive class probabilities. Machine Learning 46, 1 (2002), 21–52.
[198]
Thilo Spinner, Daniel Fürst, and Mennatallah El-Assady. 2024. iNNspector: Visual, interactive deep model debugging. arXiv:2407.17998. Retrieved from http://arxiv.org/abs/2407.17998
[199]
Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. 2014. Striving for simplicity: The all convolutional net. arXiv:1412.6806. Retrieved from https://arxiv.org/abs/1412.6806
[200]
Wolfgang Stammer, Patrick Schramowski, and Kristian Kersting. 2021. Right for the right concept: Revising neuro-symbolic concepts by interacting with their explanations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3619–3629.
[201]
Pierre Stock and Moustapha Cisse. 2018. ConvNets and ImageNet beyond accuracy: Understanding mistakes and uncovering biases. In Proceedings of the European Conference on Computer Vision (ECCV ’18), 498–512.
[202]
Jungyo Suh, Sangjun Yoo, Juhyun Park, Sung Yong Cho, Min Chul Cho, Hwancheol Son, and Hyeon Jeong. 2020. Development and validation of an explainable artificial intelligence-based decision-supporting tool for prostate biopsy. BJU International 126, 6 (2020), 694–703.
[203]
Shimon Sumita, Hiroyuki Nakagawa, and Tatsuhiro Tsuchiya. 2023. Xtune: An XAI-based hyperparameter tuning method for time-series forecasting using deep learning. Retrieved from http://www.researchsquare.com/article/rs-3008932/v1
[204]
Dong Sun, Zezheng Feng, Yuanzhe Chen, Yong Wang, Jia Zeng, Mingxuan Yuan, Ting-Chuen Pong, and Huamin Qu. 2020. DFSeer: A visual analytics approach to facilitate model selection for demand forecasting. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13.
[205]
Harini Suresh, Steven R. Gomez, Kevin K. Nam, and Arvind Satyanarayan. 2021. Beyond expertise and roles: A framework to characterize the stakeholders of interpretable machine learning and their needs. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–16.
[206]
Jianrong Tao, Yu Xiong, Shiwei Zhao, Yuhong Xu, Jianshi Lin, Runze Wu, and Changjie Fan. 2020. XAI-driven explainable multi-view game cheating detection. In Proceedings of the 2020 IEEE Conference on Games (CoG ’20). IEEE, 144–151.
[207]
Stefano Teso and Kristian Kersting. 2019. Explanatory interactive machine learning. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 239–245.
[208]
Patanamon Thongtanunam, Xin Yang, Norihiro Yoshida, Raula Gaikovina Kula, Ana Erika Camargo Cruz, Kenji Fujiwara, and Hajimu Iida. 2014. ReDA: A web-based visualization tool for analyzing modern code review dataset. In Proceedings of the 2014 IEEE International Conference on Software Maintenance and Evolution. IEEE, 605–608.
[209]
Gabriele Tolomei, Fabrizio Silvestri, Andrew Haines, and Mounia Lalmas. 2017. Interpretable predictions of tree-based ensembles via actionable feature tweaking. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 465–474.
[210]
Richard Tomsett, Dave Braines, Dan Harborne, Alun Preece, and Supriyo Chakraborty. 2018. Interpretable to whom? A role-based model for analyzing interpretable machine learning systems. arXiv:1806.07552. Retrieved from http://arxiv.org/abs/1806.07552
[211]
Geoffrey G. Towell and Jude W. Shavlik. 1993. Extracting refined rules from knowledge-based neural networks. Machine Learning 13, 1 (1993), 71–101.
[212]
Sandhya Tripathi, N. Hemachandra, and Prashant Trivedi. 2020. Interpretable feature subset selection: A Shapley value based approach. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data ’20). IEEE, 5463–5472.
[213]
Stéphane Tufféry. 2011. Data Mining and Statistics for Decision Making. John Wiley & Sons.
[214]
European Union. 2018. General Data Protection Regulation (GDPR). Retrieved from https://gdpr-info.eu/
[215]
Jasper van der Waa, Tjeerd Schoonderwoerd, Jurriaan van Diggelen, and Mark Neerincx. 2020. Interpretable confidence measures for decision support systems. International Journal of Human-Computer Studies 144 (2020), 102493.
[216]
Corne Van Zyl, Xianming Ye, and Raj Naidoo. 2024. Harnessing eXplainable artificial intelligence for feature selection in time series energy forecasting: A comparative analysis of Grad-CAM and SHAP. Applied Energy 353 (2024), 122079.
[217]
Sahil Verma, John Dickerson, and Keegan Hines. 2024. Counterfactual explanations and algorithmic recourses for machine learning: A review. ACM Computing Surveys 56, 12 (2024), 1–42.
[218]
Tom Vermeire, Thibault Laugel, Xavier Renard, David Martens, and Marcin Detyniecki. 2021. How to choose an explainability method? Towards a methodical implementation of XAI in practice. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 521–533.
[219]
Trent W. Victor, Emma Tivesten, Pär Gustavsson, Joel Johansson, Fredrik Sangberg, and Mikael Ljung Aust. 2018. Automation expectation mismatch: Incorrect prediction despite eyes on threat and hands on wheel. Human Factors 60, 8 (2018), 1095–1116.
[220]
Marina M.-C. Vidovic, Nico Görnitz, Klaus-Robert Müller, and Marius Kloft. 2016. Feature importance measure for non-linear learning algorithms. arXiv:1611.07567. Retrieved from http://arxiv.org/abs/1611.07567
[221]
Giulia Vilone and Luca Longo. 2020. Explainable artificial intelligence: A systematic review. arXiv:2006.00093. Retrieved from http://arxiv.org/abs/2006.00093
[222]
Klaus Virtanen. 2022. Using XAI tools to detect harmful bias in ML models. Bachelor's thesis. Umeå University, Department of Computing Science.
[223]
Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2017. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law & Technology 31 (2017), 841.
[224]
Syed Wali and Irfan Khan. 2021. Explainable AI and random forest based reliable intrusion detection system. TechRxiv preprint.
[225]
Dakuo Wang, Justin D. Weisz, Michael Muller, Parikshit Ram, Werner Geyer, Casey Dugan, Yla Tausczik, Horst Samulowitz, and Alexander Gray. 2019. Human-AI collaboration in data science: Exploring data scientists’ perceptions of automated AI. Proceedings of the ACM on Human-Computer Interaction 3, CSCW (2019), 1–24.
[226]
Maonan Wang, Kangfeng Zheng, Yanqing Yang, and Xiujuan Wang. 2020. An explainable machine learning framework for intrusion detection systems. IEEE Access 8 (2020), 73127–73141.
[227]
Tong Wang. 2019. Gaining free or low-cost interpretability with interpretable partial substitute. In Proceedings of the International Conference on Machine Learning. PMLR, 6505–6514.
[228]
Ziming Wang, Changwu Huang, Yun Li, and Xin Yao. 2024. Multi-objective feature attribution explanation for explainable machine learning. ACM Transactions on Evolutionary Learning and Optimization 4, 1 (2024), 1–32.
[229]
Ziming Wang, Changwu Huang, and Xin Yao. 2023. Feature attribution explanation to detect harmful dataset shift. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN ’23). IEEE, 1–8.
[230]
Ziming Wang, Changwu Huang, and Xin Yao. 2024. Procedural fairness in machine learning. arXiv:2404.01877. Retrieved from http://arxiv.org/abs/2404.01877
[231]
Geoffrey I. Webb, Loong Kuan Lee, François Petitjean, and Bart Goethals. 2017. Understanding concept drift. arXiv:1704.00362. Retrieved from http://arxiv.org/abs/1704.00362
[232]
Katharina Weitz, Dominik Schiller, Ruben Schlagowski, Tobias Huber, and Elisabeth André. 2019. ‘Do you trust me?’ Increasing user-trust by integrating virtual agents in explainable AI interaction design. In Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 7–9.
[233]
Adrian Weller. 2017. Challenges for transparency. arXiv:1708.01870. Retrieved from http://arxiv.org/abs/1708.01870
[234]
Maximilian Wich, Jan Bauer, and Georg Groh. 2020. Impact of politically biased data on hate speech classification. In Proceedings of the 4th Workshop on Online Abuse and Harms, 54–64.
[235]
Andrew C. Wicks, Shawn L. Berman, and Thomas M. Jones. 1999. The structure of optimal trust: Moral and strategic implications. Academy of Management Review 24, 1 (1999), 99–116.
[236]
Gesa Wiegand, Matthias Schmidmaier, Thomas Weber, Yuanting Liu, and Heinrich Hussmann. 2019. I drive-you trust: Explaining driving behavior of autonomous cars. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 1–6.
[237]
Rüdiger Wirth and Jochen Hipp. 2000. CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, Vol. 1, Manchester, 29–39.
[238]
Kanit Wongsuphasawat, Daniel Smilkov, James Wexler, Jimbo Wilson, Dandelion Mane, Doug Fritz, Dilip Krishnan, Fernanda B. Viégas, and Martin Wattenberg. 2017. Visualizing dataflow graphs of deep learning models in tensorflow. IEEE Transactions on Visualization and Computer Graphics 24, 1 (2017), 1–12.
[239]
Yao Xie, Melody Chen, David Kao, Ge Gao, and Xiang ‘Anthony’ Chen. 2020. CheXplain: Enabling physicians to explore and understand data-driven, AI-enabled medical imaging analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13.
[240]
Zhi Yang, Ziming Wang, Changwu Huang, and Xin Yao. 2023. An explainable feature selection approach for fair machine learning. In Proceedings of the International Conference on Artificial Neural Networks. Springer, 75–86.
[241]
Qinghao Ye, Jun Xia, and Guang Yang. 2021. Explainable AI for COVID-19 CT classifiers: An initial comparison study. In Proceedings of the 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS ’21). IEEE, 521–526.
[242]
Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. 2015. Understanding neural networks through deep visualization. arXiv:1506.06579. Retrieved from http://arxiv.org/abs/1506.06579
[243]
Mohammad Zaeri-Amirani, Fatemeh Afghah, and Sajad Mousavi. 2018. A feature selection method based on Shapley value to false alarm reduction in ICUs: A genetic-algorithm approach. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC ’18). IEEE, 319–323.
[244]
Matthew D. Zeiler and Rob Fergus. 2014. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision. Springer, 818–833.
[245]
Rowan Zellers, Yonatan Bisk, Ali Farhadi, and Yejin Choi. 2019. From recognition to cognition: Visual commonsense reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6720–6731.
[246]
Chanyuan Abigail Zhang, Soohyun Cho, and Miklos Vasarhelyi. 2022. Explainable artificial intelligence (XAI) in auditing. International Journal of Accounting Information Systems 46 (2022), 100572.
[247]
Junzhe Zhang and Elias Bareinboim. 2018. Fairness in decision-making—The causal explanation formula. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 2037–2045.
[248]
Jianming Zhang, Sarah Adel Bargal, Zhe Lin, Jonathan Brandt, Xiaohui Shen, and Stan Sclaroff. 2018. Top-down neural attention by excitation backprop. International Journal of Computer Vision 126, 10 (2018), 1084–1102.
[249]
Ke Zhang, Jun Zhang, Pei-Dong Xu, Tianlu Gao, and David Wenzhong Gao. 2021. Explainable AI in deep reinforcement learning models for power system emergency control. IEEE Transactions on Computational Social Systems 9, 2 (2021), 419–427.
[250]
Quanshi Zhang, Ruiming Cao, Ying Nian Wu, and Song-Chun Zhu. 2017. Mining object parts from CNNs via active question-answering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 346–355.
[251]
Quanshi Zhang, Ruiming Cao, Feng Shi, Ying Nian Wu, and Song-Chun Zhu. 2018. Interpreting CNN knowledge via an explanatory graph. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 4454–4463.
[252]
Quanshi Zhang, Ruiming Cao, Ying Nian Wu, and Song-Chun Zhu. 2017. Growing interpretable part graphs on ConvNets via multi-shot learning. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2898–2906.
[253]
Quanshi Zhang, Jie Ren, Ge Huang, Ruiming Cao, Ying Nian Wu, and Song-Chun Zhu. 2020. Mining interpretable AOG representations from convolutional networks via active question answering. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 11 (2020), 3949–3963.
[254]
Quanshi Zhang, Xin Wang, Ruiming Cao, Ying Nian Wu, Feng Shi, and Song-Chun Zhu. 2020. Extraction of an explanatory graph to interpret a CNN. IEEE Transactions on Pattern Analysis and Machine Intelligence 43, 11 (2020), 3863–3877.
[255]
Quanshi Zhang, Ying Nian Wu, and Song-Chun Zhu. 2018. Interpretable convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 8827–8836.
[256]
Quanshi Zhang, Yu Yang, Haotian Ma, and Ying Nian Wu. 2019. Interpreting CNNs via decision trees. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6261–6270.
[257]
Yu Zhang, Peter Tiňo, Aleš Leonardis, and Ke Tang. 2021. A survey on neural network interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence 5, 5 (2021), 726–742.

        Published In

        ACM Transactions on Autonomous and Adaptive Systems, Volume 19, Issue 4, December 2024, 179 pages
        EISSN: 1556-4703
        DOI: 10.1145/3613715
        This work is licensed under a Creative Commons Attribution International 4.0 License.

        Publisher

        Association for Computing Machinery, New York, NY, United States

        Publication History

        Published: 24 November 2024
        Online AM: 05 November 2024
        Accepted: 21 October 2024
        Revised: 20 October 2024
        Received: 30 September 2024

        Author Tags

        1. Explainable artificial intelligence
        2. Explainability
        3. Human-computer interaction
        4. Transparency
        5. Trustworthy artificial intelligence

        Funding Sources

        • National Natural Science Foundation of China
        • Guangdong Provincial Key Laboratory
        • Program for Guangdong Introducing Innovative and Entrepreneurial Teams
