Predictive Models as Early Warning Systems: A Bayesian Classification Model to Identify At-Risk Students of Programming

Ashok Kumar Veerasamy, Mikko-Jussi Laakso, Daryl D’Souza & Tapio Salakoski

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 284)

Abstract

The pursuit of a deeper understanding of the factors that influence student performance outcomes has long been of interest to the computing education community. One strand of this work is the development of effective models to predict student academic performance. Such predictive models may serve as early warning systems to identify students at risk of failing or quitting early. This paper presents a class of machine learning predictive models, based on Naive Bayes classification, to predict student performance in introductory programming. The models use formative assessment tasks and self-reported cognitive features such as prior programming knowledge and problem-solving skills. Our analysis revealed that the use of just three variables gave a good fit for the models employed. The models that used in-class assessment and cognitive features as predictors returned the best at-risk prediction accuracies, compared with models that used take-home assessment and cognitive features as predictors. On unseen data for the course, the overall prediction accuracy in identifying at-risk students was 71%, consistent with an area under the ROC curve (AUC) of 0.66. Based on these results, we present a generic predictive model and its potential application as an early warning system for the early identification of students at risk.
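To make the modelling idea concrete, the following is a minimal sketch of a Naive Bayes at-risk classifier with three predictors, evaluated by overall accuracy and AUC on held-out data. It is not the authors' implementation: the Gaussian likelihood, the synthetic data generator, the 0.5 decision threshold, and the feature names (in_class_score, prior_knowledge, problem_solving) are all assumptions chosen for illustration, and the printed numbers are not expected to match the results reported above.

```python
import math
import random

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9  # small floor avoids zero variance
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_proba_at_risk(model, x):
    """Posterior P(at-risk = 1 | x) under the naive independence assumption."""
    log_joint = {}
    for label, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, var in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_joint[label] = ll
    top = max(log_joint.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_joint.items()}
    return weights[1] / sum(weights.values())

def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_synthetic(n, rng):
    """Hypothetical data: [in_class_score, prior_knowledge, problem_solving]."""
    X, y = [], []
    for _ in range(n):
        at_risk = 1 if rng.random() < 0.3 else 0
        shift = -1.0 if at_risk else 0.0  # at-risk students score lower on average
        X.append([rng.gauss(60 + 15 * shift, 12),
                  rng.gauss(3 + shift, 1),
                  rng.gauss(100 + 10 * shift, 15)])
        y.append(at_risk)
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(200, rng)  # stands in for past course offerings
X_test, y_test = make_synthetic(100, rng)    # stands in for unseen data

model = fit_gaussian_nb(X_train, y_train)
scores = [predict_proba_at_risk(model, x) for x in X_test]
predictions = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
test_auc = auc(scores, y_test)
print(f"accuracy={accuracy:.2f}, AUC={test_auc:.2f}")
```

In an early warning setting, the posterior probability is arguably more useful than the hard label: students can be ranked by predicted risk, and the threshold can be lowered if missing an at-risk student is costlier than a false alarm.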

Acknowledgments

The authors wish to thank all members of the ViLLE research team and the Department of Future Technologies, University of Turku, for their comments and support, which greatly improved the manuscript. This research was fully supported by the University of Turku, Turku, Finland.

Author information

Authors and Affiliations

University of Turku, Turku, Finland
Ashok Kumar Veerasamy, Mikko-Jussi Laakso & Tapio Salakoski
RMIT University, Melbourne, Australia
Daryl D’Souza

Corresponding author

Correspondence to Ashok Kumar Veerasamy.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai

About this paper

Cite this paper

Veerasamy, A.K., Laakso, MJ., D’Souza, D., Salakoski, T. (2021). Predictive Models as Early Warning Systems: A Bayesian Classification Model to Identify At-Risk Students of Programming. In: Arai, K. (eds) Intelligent Computing. Lecture Notes in Networks and Systems, vol 284. Springer, Cham. https://doi.org/10.1007/978-3-030-80126-7_14

DOI: https://doi.org/10.1007/978-3-030-80126-7_14
Published: 07 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80125-0
Online ISBN: 978-3-030-80126-7
eBook Packages: Intelligent Technologies and Robotics (R0)
