Extreme logistic regression

Che Ngufor¹ &
Janusz Wojtusiak²

Abstract

Kernel logistic regression (KLR) is a powerful algorithm that has been shown to be competitive with many state-of-the-art machine learning algorithms such as support vector machines (SVM). Unlike SVM, KLR extends easily to multi-class problems and produces class posterior probability estimates, making it useful for many real-world applications. However, training KLR with gradient-based methods or iterative re-weighted least squares can be unbearably slow for large datasets. Coupled with a poorly conditioned design matrix and the need for parameter tuning, training KLR can quickly become infeasible for some real datasets. The goal of this paper is to present simple, fast, scalable, and efficient algorithms for learning KLR. First, based on a simple approximation of the logistic function, a least squares algorithm for KLR is derived that avoids the iterative tuning of gradient-based methods. Second, inspired by extreme learning machine (ELM) theory, an explicit feature space is constructed through a generalized single-hidden-layer feedforward network and used for training iterative re-weighted least squares KLR (IRLS-KLR) and the newly proposed least squares KLR (LS-KLR). Finally, for large-scale and/or poorly conditioned problems, a robust and efficient preconditioned learning technique is proposed for learning the algorithms presented in the paper. Numerical results on a series of artificial datasets and 12 real benchmark datasets show, first, that LS-KLR compares favorably with SVM and traditional IRLS-KLR in terms of accuracy and learning speed. Second, the extension of ELM to KLR results in simple, scalable, and very fast algorithms whose generalization performance is comparable to that of their original versions. Finally, the introduced preconditioned learning method can significantly increase the learning speed of IRLS-KLR.
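To make the second idea in the abstract concrete, the following is a minimal sketch of logistic regression trained by IRLS on an explicit random feature space of the ELM kind. It is not the authors' implementation: the sigmoid hidden layer, the ridge penalty `lam`, the number of hidden nodes `L`, the toy data, and the function names `elm_features` and `irls_logistic` are all assumptions made for illustration. The LS-KLR approximation and the preconditioning technique are not reproduced here.

```python
import numpy as np

def elm_features(X, W, b):
    """Explicit feature map of a single hidden layer with fixed random weights.

    Assumption: sigmoid activation; W and b are drawn once and never trained.
    """
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def irls_logistic(H, y, lam=0.1, n_iter=25, tol=1e-8):
    """Ridge-penalised logistic regression fitted by IRLS (Newton steps).

    H is the (n, d) feature matrix, y holds 0/1 labels, lam is a
    hypothetical regularisation strength.
    """
    n, d = H.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        eta = H @ beta                            # linear predictor
        p = 1.0 / (1.0 + np.exp(-eta))            # current probabilities
        w = np.clip(p * (1.0 - p), 1e-10, None)   # IRLS weights, kept positive
        z = eta + (y - p) / w                     # working response
        A = H.T @ (w[:, None] * H) + lam * np.eye(d)
        beta_new = np.linalg.solve(A, H.T @ (w * z))  # weighted least squares
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Toy two-class problem (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-1.0, 0.5, size=(n // 2, 2)),
               rng.normal(+1.0, 0.5, size=(n // 2, 2))])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])

L = 50                                  # number of hidden nodes (assumed)
W = rng.normal(size=(2, L))             # random input weights
b = rng.normal(size=L)                  # random biases

H = elm_features(X, W, b)
beta = irls_logistic(H, y)
prob = 1.0 / (1.0 + np.exp(-(H @ beta)))    # class posterior estimates
acc = np.mean((prob > 0.5) == (y == 1))     # training accuracy
print(f"training accuracy: {acc:.3f}")
```

Because the feature map is explicit, each IRLS step in this sketch solves a linear system of size `L` by `L` rather than `n` by `n`, which is the kind of scaling benefit the abstract attributes to the ELM extension.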

Author information

Authors and Affiliations

School of Physics, Astronomy, and Computational Sciences, George Mason University, Fairfax, VA, 22030, USA
Che Ngufor
Department of Health Administration and Policy, George Mason University, Fairfax, VA, 22030, USA
Janusz Wojtusiak

Corresponding author

Correspondence to Che Ngufor.

About this article

Cite this article

Ngufor, C., Wojtusiak, J. Extreme logistic regression. Adv Data Anal Classif 10, 27–52 (2016). https://doi.org/10.1007/s11634-014-0194-2

Received: 06 February 2014
Revised: 23 November 2014
Accepted: 13 December 2014
Published: 31 December 2014
Issue Date: March 2016
DOI: https://doi.org/10.1007/s11634-014-0194-2
