DOI: 10.1145/3600211.3604656
Research article · Open access

Iterative Partial Fulfillment of Counterfactual Explanations: Benefits and Risks

Published: 29 August 2023

Abstract

Counterfactual (CF) explanations, also known as contrastive explanations and algorithmic recourses, are popular for explaining machine learning models in high-stakes domains. For a subject who receives a negative model prediction (e.g., a mortgage application denial), CF explanations are similar instances that receive positive predictions, informing the subject of ways to improve. While various properties of CF explanations have been studied, such as validity and stability, we contribute a novel one: their behavior under iterative partial fulfillment (IPF). Specifically, upon receiving a CF explanation, the subject may fulfill it only partially before requesting a new prediction with a new explanation, repeating until the prediction is positive. Such partial fulfillment could be due to the subject’s limited capability (e.g., being able to pay down only two of four credit card accounts at the moment) or an attempt to take a chance (e.g., betting that a monthly salary increase of $800 is enough even though $1,000 is recommended). Does such iterative partial fulfillment increase or decrease the total cost of improvement incurred by the subject? We mathematically formalize IPF and demonstrate, both theoretically and empirically, that different CF algorithms exhibit vastly different behaviors under IPF. We discuss the implications of our observations, advocate for this factor to be carefully considered in the development and study of CF algorithms, and give several directions for future work.
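As a hypothetical illustration of the IPF loop described in the abstract, the sketch below uses a one-dimensional toy setting. The classifier `predict`, the CF generator `explain`, and the fixed-fraction fulfillment rule are all illustrative assumptions, not the paper's actual models or algorithms.

```python
# Toy sketch of iterative partial fulfillment (IPF).
# All names and rules here are illustrative assumptions.

def predict(x):
    # Toy binary classifier: approve when the single feature reaches 1.0.
    return x >= 1.0

def explain(x):
    # Toy CF generator: recommend crossing the boundary with a 0.1 margin.
    return max(x, 1.0) + 0.1

def ipf(x, alpha=0.5, max_rounds=100):
    """Repeatedly move a fraction `alpha` of the way toward each freshly
    requested CF explanation until the prediction turns positive.
    Returns the final instance and the total cost (distance traveled)."""
    total_cost = 0.0
    for _ in range(max_rounds):
        if predict(x):
            break
        cf = explain(x)            # request a new explanation
        step = alpha * (cf - x)    # fulfill it only partially
        total_cost += abs(step)
        x += step
    return x, total_cost

final_x, cost = ipf(0.0, alpha=0.5)
# In this toy setting, cost ≈ 1.03, versus 1.1 for fulfilling the first
# explanation in full: here IPF lowers the total cost, but other CF
# generators can make it higher instead.
print(round(final_x, 5), round(cost, 5))
```

Whether IPF helps or hurts depends on how the CF generator places its recommendations relative to the decision boundary, which is the behavior the paper studies.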



        Published In

        AIES '23: Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society
        August 2023, 1026 pages
        ISBN: 9798400702310
        DOI: 10.1145/3600211
        This work is licensed under a Creative Commons Attribution 4.0 International License.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. counterfactual explanation
        2. interpretability
        3. societal impacts of AI

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        AIES '23: AAAI/ACM Conference on AI, Ethics, and Society
        August 8-10, 2023
        Montréal, QC, Canada

        Acceptance Rates

        Overall Acceptance Rate 61 of 162 submissions, 38%

