Abstract
The pursuit of a deeper understanding of the factors that influence student performance has long been of interest to the computing education community. One strand of this work is the development of effective models to predict student academic performance. Such predictive models may serve as early warning systems that identify students at risk of failing or dropping out early. This paper presents a class of machine learning predictive models, based on Naive Bayes classification, to predict student performance in introductory programming. The models use formative assessment tasks and self-reported cognitive features such as prior programming knowledge and problem-solving skills. Our analysis revealed that just three variables provided a good fit for the models employed. Models that used in-class assessment and cognitive features as predictors returned the best at-risk prediction accuracies, compared with models that used take-home assessment and cognitive features as predictors. On unseen data for the course, the overall prediction accuracy in identifying at-risk students was 71%, consistent with the area under the ROC curve (AUC) score of 0.66. Based on these results, we present a generic predictive model and its potential application as an early warning system for the early identification of at-risk students.
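As a rough illustration of the kind of model the abstract describes, the following minimal sketch fits a Gaussian Naive Bayes classifier on three predictors and scores it with accuracy and AUC. It is not the authors' implementation: the Gaussian variant, the feature scales, the function names, and every data value are assumptions made up for this example, and the toy numbers are not meant to reproduce the reported 71% accuracy or 0.66 AUC.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def prob_at_risk(model, row):
    """Posterior probability of class 1 (at risk), assuming independent features."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        s = math.log(prior)
        for v, m, var in zip(row, means, variances):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = s
    top = max(log_scores.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[1] / sum(weights.values())

def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: [in-class assessment score, prior programming
# knowledge rating, self-reported problem-solving score]; label 1 = at risk.
train_X = [[35, 0, 40], [42, 1, 45], [50, 0, 38], [30, 1, 52],
           [80, 3, 75], [72, 2, 68], [90, 4, 80], [65, 2, 70]]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]
test_X = [[38, 0, 44], [85, 3, 78], [45, 1, 50], [75, 2, 72]]
test_y = [1, 0, 1, 0]

model = fit_gaussian_nb(train_X, train_y)
scores = [prob_at_risk(model, row) for row in test_X]
predictions = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"accuracy={accuracy:.2f}, auc={auc(test_y, scores):.2f}")
```

The toy classes are deliberately well separated, so the held-out scores here say nothing about real cohorts. On actual course data, both metrics would need to be estimated on students the model has not seen, as the abstract does.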
Acknowledgments
The authors wish to thank all members of the ViLLE research team and the Department of Future Technologies, University of Turku, for their comments and support, which greatly improved the manuscript. This research was fully supported by the University of Turku, Turku, Finland.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Veerasamy, A.K., Laakso, MJ., D’Souza, D., Salakoski, T. (2021). Predictive Models as Early Warning Systems: A Bayesian Classification Model to Identify At-Risk Students of Programming. In: Arai, K. (eds) Intelligent Computing. Lecture Notes in Networks and Systems, vol 284. Springer, Cham. https://doi.org/10.1007/978-3-030-80126-7_14
DOI: https://doi.org/10.1007/978-3-030-80126-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80125-0
Online ISBN: 978-3-030-80126-7
eBook Packages: Intelligent Technologies and Robotics (R0)