DOI: 10.1145/3478431.3499384
research-article
Open access

How Computer Science and Statistics Instructors Approach Data Science Pedagogy Differently: Three Case Studies

Published: 22 February 2022

Abstract

Over the past decade, data science courses have been growing more popular across university campuses. These courses often involve a mix of programming and statistics and are taught by instructors from diverse backgrounds. In our experiences launching a data science program at a large public U.S. university over the past four years, we noticed one central tension within many such courses: instructors must finely balance how much computing versus statistics to teach in the limited available time. In this experience report, we provide a detailed firsthand reflection on how we have personally balanced these two major topic areas within several offerings of a large introductory data science course that we taught and wrote an accompanying textbook for; our course has served several thousand students over the past four years. We present three case studies from our experiences to illustrate how computer science and statistics instructors approach data science differently on topics ranging from algorithmic depth to modeling to data acquisition. We then draw connections to deeper tradeoffs in data science to help guide instructors who design interdisciplinary courses. We conclude by suggesting ways that instructors can incorporate both computer science and statistics perspectives to improve data science teaching.


    Published In

    SIGCSE 2022: Proceedings of the 53rd ACM Technical Symposium on Computer Science Education - Volume 1
    February 2022
    1049 pages
    ISBN:9781450390705
    DOI:10.1145/3478431
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. case studies
    2. data science
    3. programming education
    4. statistics

    Qualifiers

    • Research-article

    Funding Sources

    • NSF
    • Alfred P. Sloan Foundation

    Conference

    SIGCSE 2022

    Acceptance Rates

    Overall Acceptance Rate 1,595 of 4,542 submissions, 35%
