Abstract
In bioinformatics, regularized linear discriminant analysis is commonly used for supervised classification of high-dimensional data, in which the number of variables exceeds the number of observations. However, its available versions are highly vulnerable to outlying measurements in the data. In this paper, we apply principles of robust statistics to propose new versions of regularized linear discriminant analysis suitable for high-dimensional data contaminated by outliers of varying severity. The work builds on a regularized version of the minimum weighted covariance determinant estimator, one of the highly robust estimators of multivariate location and scatter. The performance of the novel classification methods is illustrated on real data sets, with a detailed analysis of data from brain activity research.
Acknowledgments
Preliminary results were first presented at the BIOSTEC/BIOINFORMATICS 2016 conference (21–23 February 2016 in Rome), where they were published in the proceedings.
The work was supported by project No. LO1611, with financial support from the MEYS under the NPU I program. The work of J. Kalina was financially supported by the Neuron Fund for Support of Science. The work of J. Hlinka was supported by the Czech Science Foundation project No. 13-23940S.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kalina, J., Hlinka, J. (2017). Implicitly Weighted Robust Classification Applied to Brain Activity Research. In: Fred, A., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2016. Communications in Computer and Information Science, vol 690. Springer, Cham. https://doi.org/10.1007/978-3-319-54717-6_6
DOI: https://doi.org/10.1007/978-3-319-54717-6_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54716-9
Online ISBN: 978-3-319-54717-6
eBook Packages: Computer Science, Computer Science (R0)