research-article

My Model is Unfair, Do People Even Care? Visual Design Affects Trust and Perceived Bias in Machine Learning

Published: 25 October 2023

Abstract

Machine learning technology has become ubiquitous, but, unfortunately, often exhibits bias. As a consequence, disparate stakeholders need to interact with and make informed decisions about using machine learning models in everyday systems. Visualization technology can support stakeholders in understanding and evaluating trade-offs between, for example, accuracy and fairness of models. This paper aims to empirically answer “Can visualization design choices affect a stakeholder's perception of model bias, trust in a model, and willingness to adopt a model?” Through a series of controlled, crowd-sourced experiments with more than 1,500 participants, we identify a set of strategies people follow in deciding which models to trust. Our results show that men and women prioritize fairness and performance differently and that visual design choices significantly affect that prioritization. For example, women trust fairer models more often than men do, participants value fairness more when it is explained using text than as a bar chart, and being explicitly told a model is biased has a bigger impact than showing past biased performance. We test the generalizability of our results by comparing the effect of multiple textual and visual design choices and offer potential explanations of the cognitive mechanisms behind the difference in fairness perception and trust. Our research guides design considerations to support future work developing visualization systems for machine learning.


Published In

IEEE Transactions on Visualization and Computer Graphics, Volume 30, Issue 1
Jan. 2024
1456 pages

Publisher

IEEE Educational Activities Department
United States
