Tea: A High-level Language and Runtime System for Automating Statistical Analysis

Authors:

Katharina Reinecke

UIST '19: Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology

Pages 591 - 603

https://doi.org/10.1145/3332165.3347940

Published: 17 October 2019

Abstract

Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
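
The idea of deriving the set of valid tests from declared properties can be illustrated with a minimal sketch. This is not Tea's API or its actual constraint encoding: the catalogue, the property names, and the `valid_tests` helper are all hypothetical, and the statistical requirements are deliberately simplified. It only shows how dropping a parametric assumption narrows the valid set to non-parametric tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatTest:
    """A statistical test and the properties it needs (illustrative only)."""
    name: str
    requires: frozenset


# Hypothetical, simplified catalogue; real tests have more nuanced assumptions.
CATALOGUE = [
    StatTest("students_t", frozenset(
        {"two_groups", "independent_samples", "normal", "equal_variance"})),
    StatTest("welchs_t", frozenset(
        {"two_groups", "independent_samples", "normal"})),
    StatTest("mann_whitney_u", frozenset(
        {"two_groups", "independent_samples"})),
    StatTest("paired_t", frozenset(
        {"two_groups", "paired_samples", "normal"})),
    StatTest("wilcoxon_signed_rank", frozenset(
        {"two_groups", "paired_samples"})),
]


def valid_tests(properties):
    """Return the names of tests whose requirements all hold."""
    held = frozenset(properties)
    # A test is valid when its required properties are a subset of those held.
    return [t.name for t in CATALOGUE if t.requires <= held]


# Parametric assumptions hold: parametric and non-parametric tests are valid.
parametric = valid_tests(
    {"two_groups", "independent_samples", "normal", "equal_variance"})

# Normality is not met: only the non-parametric test remains.
nonparametric = valid_tests({"two_groups", "independent_samples"})
```

In this sketch the subset check stands in for a constraint solver; a real system would also have to verify the properties against the data rather than take them as given.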

Supplementary Material

MP4 File (ufp8521pv.mp4): Preview video, 4.18 MB

MP4 File (p591-jun.mp4): 560.81 MB

Index Terms

Tea: A High-level Language and Runtime System for Automating Statistical Analysis
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools
      1. User interface toolkits

Information & Contributors

Information

Published In

UIST '19: Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology

October 2019

1229 pages

ISBN: 9781450368162

DOI: 10.1145/3332165

General Chair: François Guimbretière (Cornell University, USA)

Program Chairs: Michael Bernstein (Stanford University, USA), Katharina Reinecke (University of Washington, USA)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 October 2019

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

UIST '19: The 32nd Annual ACM Symposium on User Interface Software and Technology

October 20 - 23, 2019

New Orleans, LA, USA

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%

Bibliometrics

Total Citations: 19
Total Downloads: 986
Downloads (Last 12 months): 263
Downloads (Last 6 weeks): 41

Reflects downloads up to 03 Jan 2025

Cited By

Savage V, Püsök N, Goldstein H, Nandi C, Ren J, Oehlberg L. (2024). Demonstrating FEDT: Supporting Characterization Experiments in Fabrication Research. Adjunct Proceedings of the 9th ACM Symposium on Computational Fabrication, 1-3. Online publication date: 7-Jul-2024.
https://dl.acm.org/doi/10.1145/3665662.3673270

Liang J, Badea C, Bird C, DeLine R, Ford D, Forsgren N, Zimmermann T. (2024). Can GPT-4 Replicate Empirical Software Engineering Research? Proceedings of the ACM on Software Engineering 1:FSE, 1330-1353. Online publication date: 12-Jul-2024.
https://dl.acm.org/doi/10.1145/3660767

Jun E, Misback E, Heer J, Just R. (2024). rTisane: Externalizing conceptual models for data analysis prompts reconsideration of domain assumptions and facilitates statistical modeling. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-16. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642267

Gu K, Grunde-McLaughlin M, McNutt A, Heer J, Althoff T. (2024). How Do Data Analysts Respond to AI Assistance? A Wizard-of-Oz Study. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-22. Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3641891

Zhang Y, Perer A, Epperson W. (2024). Guided Statistical Workflows with Interactive Explanations and Assumption Checking. 2024 IEEE Visualization and Visual Analytics (VIS), 26-30. Online publication date: 13-Oct-2024.
https://doi.org/10.1109/VIS55277.2024.00013

Masson D, Malacria S, Casiez G, Vogel D. (2023). Statslator: Interactive Translation of NHST and Estimation Statistics Reporting Styles in Scientific Documents. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-14. Online publication date: 29-Oct-2023.
https://dl.acm.org/doi/10.1145/3586183.3606762

Gu K, Jun E, Althoff T. (2023). Understanding and Supporting Debugging Workflows in Multiverse Analysis. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-19. Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544548.3581099

Petricek T, Burg G, Nazábal A, Ceritli T, Jiménez-Ruiz E, Williams C. (2023). AI Assistants: A Framework for Semi-Automated Data Wrangling. IEEE Transactions on Knowledge and Data Engineering 35:9, 9295-9306. Online publication date: 1-Sep-2023.
https://doi.org/10.1109/TKDE.2022.3222538

Kortum P. (2022). Where's my jetpack? Interactions 29:5, 68-71. Online publication date: 30-Aug-2022.
https://dl.acm.org/doi/10.1145/3551900

Jun E. (2022). Empowering domain experts to author valid statistical analyses. Adjunct Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology, 1-5. Online publication date: 29-Oct-2022.
https://dl.acm.org/doi/10.1145/3526114.3558530
