Gamut: A Design Probe to Understand How Data Scientists Understand Machine Learning Models

Published: 02 May 2019

Abstract

Without good models and the right tools to interpret them, data scientists risk making decisions based on hidden biases, spurious correlations, and false generalizations. This has led to a rallying cry for model interpretability. Yet the concept of interpretability remains nebulous, such that researchers and tool designers lack actionable guidelines for how to incorporate interpretability into models and accompanying tools. Through an iterative design process with expert machine learning researchers and practitioners, we designed a visual analytics system, Gamut, to explore how interactive interfaces could better support model interpretation. Using Gamut as a probe, we investigated why and how professional data scientists interpret models, and how interface affordances can support data scientists in answering questions about model interpretability. Our investigation showed that interpretability is not a monolithic concept: data scientists have different reasons to interpret models and tailor explanations for specific audiences, often balancing competing concerns of simplicity and completeness. Participants also asked to use Gamut in their work, highlighting its potential to help data scientists understand their own data.

Supplementary Material

MP4 File (paper579.mp4): Supplemental video
MP4 File (paper579p.mp4): Preview video


        Published In

        CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
        May 2019
        9077 pages
        ISBN:9781450359702
        DOI:10.1145/3290605
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. data visualization
        2. design probe
        3. interactive interfaces
        4. machine learning interpretability
        5. visual analytics

        Qualifiers

        • Research-article

        Acceptance Rates

        CHI '19 Paper Acceptance Rate: 703 of 2,958 submissions, 24%
        Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
