DOI: 10.1145/3408877.3432539

Autograding "Explain in Plain English" questions using NLP

Published: 05 March 2021

Abstract

Previous research suggests that "Explain in Plain English" (EiPE) code reading activities could play an important role in the development of novice programmers, but EiPE questions aren't heavily used in introductory programming courses because they (traditionally) required manual grading. We present what we believe to be the first automatic grader for EiPE questions and its deployment in a large-enrollment introductory programming course. Based on a set of questions deployed on a computer-based exam, we find that our implementation has an accuracy of 87-89%, which is similar in performance to course teaching assistants trained to perform this task and compares favorably to automatic short answer grading algorithms developed for other domains. In addition, we briefly characterize the kinds of answers that the current autograder fails to score correctly and the kinds of errors made by students.
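The abstract reports accuracy but does not say here how it was computed. As a minimal sketch only, assuming each response receives a binary correct/incorrect verdict and that accuracy means the share of responses where the autograder agrees with a reference (human) verdict, the calculation might look like the following. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def grading_accuracy(predicted: Sequence[bool], reference: Sequence[bool]) -> float:
    """Fraction of responses where the autograder verdict matches the reference verdict."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("need at least one graded response")
    # Count positions where both verdicts agree (both True or both False).
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical toy data (not from the paper):
# True = answer judged correct, False = answer judged incorrect.
autograder_verdicts = [True, True, False, True, False, True, True, False]
reference_verdicts = [True, False, False, True, False, True, True, True]

print(f"accuracy = {grading_accuracy(autograder_verdicts, reference_verdicts):.2%}")
```

The same function could be applied to a teaching assistant's verdicts against the reference verdicts, which is one way to put a human grader and an autograder on the same scale.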


Published In

SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education
March 2021
1454 pages
ISBN: 9781450380621
DOI: 10.1145/3408877
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. asag
  2. code reading
  3. cs1
  4. explain in plain english
  5. nlp

Qualifiers

  • Article

Conference

SIGCSE '21
Acceptance Rates

Overall Acceptance Rate 1,595 of 4,542 submissions, 35%

Article Metrics

  • Downloads (Last 12 months): 252
  • Downloads (Last 6 weeks): 30
Reflects downloads up to 28 Jan 2025

Cited By

  • (2024) Non-Expert Programmers in the Generative AI Future. Proceedings of the 3rd Annual Meeting of the Symposium on Human-Computer Interaction for Work, 1-19. DOI: 10.1145/3663384.3663393. Online publication date: 25-Jun-2024
  • (2024) Prompting for Comprehension: Exploring the Intersection of Explain in Plain English Questions and Prompt Writing. Proceedings of the Eleventh ACM Conference on Learning @ Scale, 39-50. DOI: 10.1145/3657604.3662039. Online publication date: 9-Jul-2024
  • (2024) Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills. Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1, 283-289. DOI: 10.1145/3649217.3653587. Online publication date: 3-Jul-2024
  • (2024) Code Generation Based Grading: Evaluating an Auto-grading Mechanism for "Explain-in-Plain-English" Questions. Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1, 171-177. DOI: 10.1145/3649217.3653582. Online publication date: 3-Jul-2024
  • (2024) Integrating Natural Language Prompting Tasks in Introductory Programming Courses. Proceedings of the 2024 on ACM Virtual Global Computing Education Conference V. 1, 88-94. DOI: 10.1145/3649165.3690125. Online publication date: 5-Dec-2024
  • (2024) Evaluating Large Language Model Code Generation as an Autograding Mechanism for "Explain in Plain English" Questions. Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 2, 1824-1825. DOI: 10.1145/3626253.3635542. Online publication date: 14-Mar-2024
  • (2024) How Beginning Programmers and Code LLMs (Mis)read Each Other. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-26. DOI: 10.1145/3613904.3642706. Online publication date: 11-May-2024
  • (2024) Exploring The Effectiveness of Reading vs. Tutoring For Enhancing Code Comprehension For Novices. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 38-47. DOI: 10.1145/3605098.3636007. Online publication date: 8-Apr-2024
  • (2024) Design of an Auto Evaluation Model for Subjective Answers Using Natural Language Processing and Machine Learning Techniques. Proceedings of 4th International Conference on Artificial Intelligence and Smart Energy, 200-209. DOI: 10.1007/978-3-031-61471-2_14. Online publication date: 12-Jun-2024
  • (2023) Evaluation of Submission Limits and Regression Penalties to Improve Student Behavior with Automatic Assessment Systems. ACM Transactions on Computing Education 23:3, 1-24. DOI: 10.1145/3591210. Online publication date: 20-Jun-2023
