Abstract
Optimizing the number and utility of features used in a classification analysis has been the subject of many research studies. Most current models use the end-classifications as part of the feature-reduction process, introducing circularity into the methodology. The approach demonstrated in the present research uses item response theory (IRT) to select features independently of the end-classification results, avoiding the biased accuracies that this circularity engenders. Dichotomous and polytomous IRT models were used to analyze 30 histological breast cancer features from 569 patients in the Wisconsin Diagnostic Breast Cancer data set. Based on their characteristics, three features were selected for use in a machine learning classifier. For comparison, two machine learning–based feature selection protocols, recursive feature elimination (RFE) and ridge regression, were also run, and the three features selected by each were likewise passed to the classifier. Classification results demonstrated that all three selection processes performed comparably. The unbiased nature of the IRT protocol, together with the information it provides about why specific features are useful, helps clarify which attributes make a feature suitable for use in a machine learning context.
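The two machine-learning baselines mentioned in the abstract can be sketched on the same data, which ships with scikit-learn as the Wisconsin Diagnostic Breast Cancer set (569 patients, 30 features). The IRT selection itself is not reproduced here; the classifier, regularization strengths, and cross-validation scheme below are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of the RFE and ridge-based feature selection baselines on the
# Wisconsin Diagnostic Breast Cancer data, each reduced to 3 features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y, names = data.data, data.target, data.feature_names
Xs = StandardScaler().fit_transform(X)

# RFE: recursively drop the weakest feature until 3 remain.
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=3)
rfe.fit(Xs, y)

# Ridge: rank features by absolute coefficient magnitude, keep the top 3.
ridge = Ridge(alpha=1.0).fit(Xs, y)
ridge_mask = np.zeros(len(names), dtype=bool)
ridge_mask[np.argsort(np.abs(ridge.coef_))[-3:]] = True

# Train a simple classifier on each 3-feature subset and compare accuracy.
for label, mask in [("RFE", rfe.support_), ("ridge", ridge_mask)]:
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
    acc = cross_val_score(clf, X[:, mask], y, cv=5).mean()
    print(f"{label}: {list(names[mask])} -> accuracy {acc:.3f}")
```

The abstract's claim that the protocols "performed comparably" can be checked in this spirit: each selector yields a three-feature subset, and the downstream classifier is held fixed so that any accuracy difference is attributable to the selection method alone.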
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Cite this article
Kline, A.S., Kline, T.J.B. & Lee, J. Item response theory as a feature selection and interpretation tool in the context of machine learning. Med Biol Eng Comput 59, 471–482 (2021). https://doi.org/10.1007/s11517-020-02301-x