
Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End

Published: 30 July 2021

Abstract

Feature attributions and counterfactual explanations are popular approaches to explain an ML model. The former assigns an importance score to each input feature, while the latter provides input examples with minimal changes that alter the model's prediction. To unify these approaches, we provide an interpretation based on the actual causality framework and present two key results in terms of their use. First, we present a method to generate feature attribution explanations from a set of counterfactual examples. These feature attributions convey how important a feature is to changing the classification outcome of a model, in particular whether a subset of features is necessary and/or sufficient for that change, information that attribution-based methods are unable to provide. Second, we show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency. As a result, we highlight the complementarity of these two approaches. Our evaluation on three benchmark datasets --- Adult-Income, LendingClub, and German-Credit --- confirms this complementarity: feature attribution methods like LIME and SHAP and counterfactual explanation methods like Wachter et al. and DiCE often do not agree on feature importance rankings. In addition, by restricting the features that can be modified for generating counterfactual examples, we find that the top-k features from LIME or SHAP are often neither necessary nor sufficient explanations of a model's prediction. Finally, we present a case study of different explanation methods on a real-world hospital triage problem.
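
As an illustration of how the two approaches can be bridged in practice, the sketch below derives a rough feature-importance score from an already-generated set of counterfactual examples (e.g., from DiCE or Wachter et al.'s method) by counting how often each feature must change to flip the prediction, and then measures top-k agreement with an attribution-style ranking such as LIME or SHAP would produce. This is a minimal sketch under simplifying assumptions, not the paper's exact necessity/sufficiency definitions; the feature names, query instance, counterfactual rows, and attribution scores are all hypothetical stand-ins.

```python
import numpy as np

def cf_feature_scores(query, counterfactuals, feature_names):
    """Fraction of counterfactual examples in which each feature differs from
    the query instance -- a rough proxy for how important changing that
    feature is for flipping the model's prediction."""
    query = np.asarray(query, dtype=float)
    cfs = np.asarray(counterfactuals, dtype=float)
    changed = ~np.isclose(cfs, query)       # shape: (num_cfs, num_features)
    return dict(zip(feature_names, changed.mean(axis=0)))

def topk_overlap(scores_a, scores_b, k=3):
    """Jaccard overlap between the top-k features of two scoring schemes;
    a simple way to quantify (dis)agreement between explanation methods."""
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)

if __name__ == "__main__":
    features = ["age", "education_num", "hours_per_week", "capital_gain"]
    query = [29, 10, 40, 0]                 # hypothetical Adult-Income-style row
    # Hypothetical counterfactuals, as might be produced by DiCE or
    # Wachter et al.'s method for the opposite class.
    cf_examples = [
        [29, 13, 40, 0],
        [29, 13, 45, 0],
        [29, 10, 60, 5000],
        [45, 10, 40, 7000],
    ]
    cf_scores = cf_feature_scores(query, cf_examples, features)
    # Hypothetical attribution scores, standing in for LIME/SHAP output.
    attr_scores = {"age": 0.35, "education_num": 0.10,
                   "hours_per_week": 0.30, "capital_gain": 0.25}
    print("CF-based feature scores:", cf_scores)
    print("Top-3 overlap with attribution ranking:", topk_overlap(cf_scores, attr_scores))
```

A low top-k overlap mirrors the disagreement between attribution and counterfactual methods reported above; restricting the counterfactual generator to modify only an attribution method's top-k features would correspond, roughly, to the paper's necessity and sufficiency checks.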

References

[2]
UCI Machine Learning Repository. German credit dataset. Accessed 2019. https://archive.ics.uci.edu/ml/support/statlog+(german+credit+data)
[3]
Kjersti Aas, Martin Jullum, and Anders Løland. 2021. Explaining individual predictions when features are dependent: More accurate approximations to Shapley values. Artificial Intelligence (2021), 103502.
[4]
David Alvarez-Melis and Tommi S Jaakkola. 2018. On the robustness of interpretability methods. arXiv preprint arXiv:1806.08049 (2018).
[5]
Rajiv Arya, Grant Wei, Jonathan V McCoy, Jody Crane, Pamela Ohman-Strickland, and Robert M Eisenstein. 2013. Decreasing length of stay in the emergency department with a split emergency severity index 3 patient flow model. Academic Emergency Medicine, Vol. 20, 11 (2013), 1171--1179.
[6]
Vijay Arya, Rachel KE Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, Samuel C Hoffman, Stephanie Houde, Q Vera Liao, Ronny Luss, Aleksandra Mojsilović, et al. 2019. One explanation does not fit all: A toolkit and taxonomy of AI explainability techniques. arXiv preprint arXiv:1909.03012 (2019).
[7]
Solon Barocas, Andrew D Selbst, and Manish Raghavan. 2020. The hidden assumptions behind counterfactual explanations and principal reasons. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 80--89.
[8]
Samuel Carton, Anirudh Rathore, and Chenhao Tan. 2020. Evaluating and Characterizing Human Rationales. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 9294--9307.
[9]
Rich Caruana, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. 2015. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. 1721--1730.
[10]
Susanne Dandl, Christoph Molnar, Martin Binder, and Bernd Bischl. 2020a. Multi-objective counterfactual explanations. In International Conference on Parallel Problem Solving from Nature. Springer, 448--469.
[11]
Susanne Dandl, Christoph Molnar, Martin Binder, and Bernd Bischl. 2020b. Multi-objective counterfactual explanations. In International Conference on Parallel Problem Solving from Nature. Springer, 448--469.
[12]
Thomas Desautels, Jacob Calvert, Jana Hoffman, Melissa Jay, Yaniv Kerem, Lisa Shieh, David Shimabukuro, Uli Chettipally, Mitchell D Feldman, Chris Barton, et al. 2016. Prediction of sepsis in the intensive care unit with minimal electronic health record data: a machine learning approach. JMIR medical informatics, Vol. 4, 3 (2016), e28.
[13]
Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, and Byron C Wallace. 2020. ERASER: A Benchmark to Evaluate Rationalized NLP Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 4443--4458.
[14]
Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, and Payel Das. 2018. Explanations based on the missing: Towards contrastive explanations with pertinent negatives. In Advances in Neural Information Processing Systems. 592--603.
[15]
Andrea Freyer Dugas, Thomas D Kirsch, Matthew Toerper, Fred Korley, Gayane Yenokyan, Daniel France, David Hager, and Scott Levin. 2016. An electronic emergency triage system to improve patient distribution by critical outcomes. The Journal of emergency medicine, Vol. 50, 6 (2016), 910--918.
[16]
Sainyam Galhotra, Romila Pradhan, and Babak Salimi. 2021. Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals. arXiv preprint arXiv:2103.11972 (2021).
[17]
Julian S Haimovich, Arjun K Venkatesh, Abbas Shojaee, Andreas Coppi, Frederick Warner, Shu-Xia Li, and Harlan M Krumholz. 2017. Discovery of temporal and disease association patterns in condition-specific hospital utilization rates. PloS one, Vol. 12, 3 (2017), e0172049.
[18]
Joseph Y Halpern. 2016. Actual Causality. MIT Press.
[19]
Woo Suk Hong, Adrian Daniel Haimovich, and R Andrew Taylor. 2018. Predicting hospital admission at emergency department triage using machine learning. PloS one, Vol. 13, 7 (2018), e0201016.
[20]
Steven Horng, David A Sontag, Yoni Halpern, Yacine Jernite, Nathan I Shapiro, and Larry A Nathanson. 2017. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PloS one, Vol. 12, 4 (2017), e0174708.
[21]
Amir-Hossein Karimi, Gilles Barthe, Borja Balle, and Isabel Valera. 2020. Model-agnostic counterfactual explanations for consequential decisions. In International Conference on Artificial Intelligence and Statistics. PMLR, 895--905.
[22]
Amir-Hossein Karimi, Bernhard Schölkopf, and Isabel Valera. 2021. Algorithmic recourse: from counterfactual explanations to interventions. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. 353--362.
[23]
Janis Klaise, Arnaud Van Looveren, Giovanni Vacanti, and Alexandru Coca. 2019. Alibi: Algorithms for monitoring and explaining machine learning models. https://github.com/SeldonIO/alibi
[24]
Ronny Kohavi and Barry Becker. 1996. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/adult
[25]
Alex Kulesza and Ben Taskar. 2012. Determinantal point processes for machine learning. arXiv preprint arXiv:1207.6083 (2012).
[26]
I Elizabeth Kumar, Suresh Venkatasubramanian, Carlos Scheidegger, and Sorelle Friedler. 2020. Problems with Shapley-value-based explanations as feature importance measures. In International Conference on Machine Learning. PMLR, 5491--5500.
[27]
Vivian Lai, Zheng Cai, and Chenhao Tan. 2019. Many Faces of Feature Importance: Comparing Built-in and Post-hoc Feature Importance in Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 486--495.
[28]
Vivian Lai and Chenhao Tan. 2019. On human predictions with explanations and predictions of machine learning models: A case study on deception detection. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 29--38.
[29]
Himabindu Lakkaraju, Stephen H Bach, and Jure Leskovec. 2016. Interpretable decision sets: A joint framework for description and prediction. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 1675--1684.
[30]
Scott Levin, Matthew Toerper, Eric Hamrock, Jeremiah S Hinson, Sean Barnes, Heather Gardner, Andrea Dugas, Bob Linton, Tom Kirsch, and Gabor Kelen. 2018. Machine-learning-based electronic triage more accurately differentiates patients with respect to clinical outcomes compared with the emergency severity index. Annals of emergency medicine, Vol. 71, 5 (2018), 565--574.
[31]
Q Vera Liao, Daniel Gruen, and Sarah Miller. 2020. Questioning the AI: Informing Design Practices for Explainable AI User Experiences. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 1--15.
[32]
Peter Lipton. 1990. Contrastive explanation. Royal Institute of Philosophy Supplements, Vol. 27 (1990), 247--266.
[33]
Zachary C Lipton. 2018. The mythos of model interpretability. Queue, Vol. 16, 3 (2018), 31--57.
[34]
Yin Lou, Rich Caruana, and Johannes Gehrke. 2012. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. 150--158.
[35]
Scott M Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Advances in neural information processing systems. 4765--4774.
[36]
Divyat Mahajan, Chenhao Tan, and Amit Sharma. 2019. Preserving causal constraints in counterfactual explanations for machine learning classifiers. arXiv preprint arXiv:1912.03277 (2019).
[37]
Tim Miller. 2018. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence (2018).
[38]
Karel GM Moons, Andre Pascal Kengne, Mark Woodward, Patrick Royston, Yvonne Vergouwe, Douglas G Altman, and Diederick E Grobbee. 2012. Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker. Heart, Vol. 98, 9 (2012), 683--690.
[39]
Ramaravind K Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 607--617.
[40]
Ziad Obermeyer and Ezekiel J Emanuel. 2016. Predicting the future-big data, machine learning, and clinical medicine. The New England journal of medicine, Vol. 375, 13 (2016), 1216.
[41]
Judea Pearl, Madelyn Glymour, and Nicholas P Jewell. 2016. Causal inference in statistics: A primer. John Wiley & Sons.
[42]
Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, and Peter Flach. 2020. FACE: feasible and actionable counterfactual explanations. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 344--350.
[43]
Kaivalya Rawal, Ece Kamar, and Himabindu Lakkaraju. 2020. Can I Still Trust You?: Understanding the Impact of Distribution Shifts on Algorithmic Recourses. arXiv preprint arXiv:2012.11788 (2020).
[44]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 1135--1144.
[45]
Donald B Rubin. 2005. Causal inference using potential outcomes: Design, modeling, decisions. J. Amer. Statist. Assoc., Vol. 100, 469 (2005), 322--331.
[46]
Cynthia Rudin. 2019. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, Vol. 1, 5 (2019), 206--215.
[47]
Chris Russell. 2019. Efficient search for diverse coherent explanations. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 20--28.
[48]
Maximilian Schleich, Zixuan Geng, Yihong Zhang, and Dan Suciu. 2021. GeCo: Quality Counterfactual Explanations in Real Time. arXiv preprint arXiv:2101.01292 (2021).
[49]
Shubham Sharma, Jette Henderson, and Joydeep Ghosh. 2019. Certifai: Counterfactual explanations for robustness, transparency, interpretability, and fairness of artificial intelligence models. arXiv preprint arXiv:1905.07857 (2019).
[50]
Shubham Sharma, Jette Henderson, and Joydeep Ghosh. 2020. Certifai: A common framework to provide explanations and analyse the fairness and robustness of black-box models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 166--172.
[51]
Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning important features through propagating activation differences. In International Conference on Machine Learning. PMLR, 3145--3153.
[52]
Kacper Sokol and Peter Flach. 2020. Explainability fact sheets: a framework for systematic assessment of explainable approaches. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 56--67.
[53]
Mukund Sundararajan and Amir Najmi. 2020. The many Shapley values for model explanation. In International Conference on Machine Learning. PMLR, 9269--9278.
[54]
Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic attribution for deep networks. In International Conference on Machine Learning. PMLR, 3319--3328.
[55]
Sarah Tan, Rich Caruana, Giles Hooker, and Yin Lou. 2017. Detecting bias in black-box models using transparent model distillation. arXiv preprint arXiv:1710.06169 (2017).
[56]
Berk Ustun, Alexander Spangher, and Yang Liu. 2019. Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency. 10--19.
[57]
Sahil Verma, John Dickerson, and Keegan Hines. 2020. Counterfactual Explanations for Machine Learning: A Review. arXiv preprint arXiv:2010.10596 (2020).
[58]
Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2017. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech., Vol. 31 (2017), 841.
[59]
James Woodward. 2006. Sensitive and insensitive causation. The Philosophical Review, Vol. 115, 1 (2006), 1--50.
[60]
Chih-Kuan Yeh, Cheng-Yu Hsieh, Arun Sai Suggala, David I Inouye, and Pradeep Ravikumar. 2019. On the (in)fidelity and sensitivity of explanations. arXiv preprint arXiv:1901.09392 (2019).
[61]
Mo Yu, Shiyu Chang, Yang Zhang, and Tommi Jaakkola. 2019. Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 4085--4094.
[62]
Yujia Zhang, Kuangyan Song, Yiming Sun, Sarah Tan, and Madeleine Udell. 2019. "Why Should You Trust My Explanation?" Understanding Uncertainty in LIME Explanations. arXiv preprint arXiv:1904.12991 (2019).


    Published In

    AIES '21: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
    July 2021
    1077 pages
    ISBN:9781450384735
    DOI:10.1145/3461702
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. actual causality
    2. counterfactual examples
    3. explanation
    4. feature attribution

    Qualifiers

    • Research-article

    Conference

    AIES '21

    Acceptance Rates

    Overall Acceptance Rate 61 of 162 submissions, 38%
