DOI: 10.1145/3411764.3445527
research-article
Open access

Fork It: Supporting Stateful Alternatives in Computational Notebooks

Published: 07 May 2021 Publication History

Abstract

Computational notebooks, which seamlessly interleave code with results, have become a popular tool for data scientists due to the iterative nature of exploratory tasks. However, notebooks provide a single execution state, which users manipulate by creating and modifying variables. When exploring alternatives, data scientists must carefully create many-step manipulations in visually distant cells.
We conducted formative interviews with 6 professional data scientists, motivating design principles behind exposing multiple states. We introduce forking (creating a new interpreter session) and backtracking (navigating through previous states). We implement these interactions as an extension to notebooks that helps data scientists more directly express and navigate through decision points in a single notebook. In a qualitative evaluation, 11 professional data scientists found the tool would be useful for exploring alternatives and debugging code to create a predictive model. Their insights highlight further challenges to scaling this functionality.
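
As a rough illustration of the two interactions named in the abstract, the following Python sketch models an interpreter session as a dictionary of variables. It is not the authors' extension: the `Session` class, its `run`, `fork`, and `backtrack` methods, and the snapshot-by-deep-copy strategy are all assumptions made for this example. Deep copying also fails for state that cannot be copied, such as open files or imported modules.

```python
import copy


def _snapshot(namespace):
    # Deep-copy user variables in one pass so aliasing between them survives.
    # exec() inserts "__builtins__" into the namespace; it is skipped here.
    user_vars = {k: v for k, v in namespace.items() if k != "__builtins__"}
    return copy.deepcopy(user_vars)


class Session:
    """Hypothetical stand-in for an interpreter session (not the paper's API)."""

    def __init__(self, namespace=None):
        self.namespace = _snapshot(namespace or {})
        self.history = []  # snapshots taken before each executed cell

    def run(self, code):
        # Record the state before running so it can be restored later.
        self.history.append(_snapshot(self.namespace))
        exec(code, self.namespace)

    def fork(self):
        # Forking: a new session that starts from an independent copy of the
        # current state. In this sketch the fork starts with an empty history.
        return Session(self.namespace)

    def backtrack(self, steps=1):
        # Backtracking: restore the state recorded before the last `steps` cells.
        for _ in range(steps):
            self.namespace = self.history.pop()


base = Session()
base.run("data = [3, 1, 2]")

alt = base.fork()           # explore an alternative without touching `base`
base.run("data.sort()")
alt.run("data.reverse()")

print(base.namespace["data"])  # [1, 2, 3]
print(alt.namespace["data"])   # [2, 1, 3]

base.backtrack()            # undo the sort in the original session
print(base.namespace["data"])  # [3, 1, 2]
```

The sketch trades memory for simplicity by copying the whole namespace on every cell; a real tool would likely need a cheaper mechanism for large datasets.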

Supplementary Material

VTT File (3411764.3445527_videofigurecaptions.vtt)
VTT File (3411764.3445527_videopreviewcaptions.vtt)
MP4 File (3411764.3445527_videofigure.mp4)
Supplemental video
MP4 File (3411764.3445527_videopreview.mp4)
Preview video

    Information & Contributors

    Information

    Published In

    CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
    May 2021
    10862 pages
    ISBN:9781450380966
    DOI:10.1145/3411764
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Alternatives
    2. code history
    3. computational notebooks
    4. exploratory programming

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CHI '21

    Acceptance Rates

    Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 325
    • Downloads (last 6 weeks): 52
    Reflects downloads up to 11 Dec 2024.

    Cited By

    • (2024) NotePlayer: Engaging Computational Notebooks for Dynamic Presentation of Analytical Processes. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 1-20. https://doi.org/10.1145/3654777.3676410. Online publication date: 13-Oct-2024
    • (2024) Multiverse Notebook: Shifting Data Scientists to Time Travelers. Proceedings of the ACM on Programming Languages 8, OOPSLA1, 754-783. https://doi.org/10.1145/3649838. Online publication date: 29-Apr-2024
    • (2024) Don't Step on My Toes: Resolving Editing Conflicts in Real-Time Collaboration in Computational Notebooks. Proceedings of the 1st ACM/IEEE Workshop on Integrated Development Environments, 47-52. https://doi.org/10.1145/3643796.3648453. Online publication date: 20-Apr-2024
    • (2024) SHARP: Exploring Version Control Systems in Live Coding Music. Proceedings of the 16th Conference on Creativity & Cognition, 426-437. https://doi.org/10.1145/3635636.3656195. Online publication date: 23-Jun-2024
    • (2024) SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-17. https://doi.org/10.1145/3613905.3650848. Online publication date: 11-May-2024
    • (2024) Human-Notebook Interactions: The CHI of Computational Notebooks. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 1-6. https://doi.org/10.1145/3613905.3636318. Online publication date: 11-May-2024
    • (2024) Evaluating Navigation and Comparison Performance of Computational Notebooks on Desktop and in Virtual Reality. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-15. https://doi.org/10.1145/3613904.3642932. Online publication date: 11-May-2024
    • (2024) OutlineSpark: Igniting AI-powered Presentation Slides Creation from Computational Notebooks through Outlines. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-16. https://doi.org/10.1145/3613904.3642865. Online publication date: 11-May-2024
    • (2024) Loops: Leveraging Provenance and Visualization to Support Exploratory Data Analysis in Notebooks. IEEE Transactions on Visualization and Computer Graphics 31, 1, 1213-1223. https://doi.org/10.1109/TVCG.2024.3456186. Online publication date: 23-Sep-2024
    • (2023) Exploring and Evaluating the Potential of 2D Computational Notebooks. Companion Proceedings of the 2023 Conference on Interactive Surfaces and Spaces, 97-99. https://doi.org/10.1145/3626485.3626554. Online publication date: 5-Nov-2023
