research-article

Enhanced algorithm for high-dimensional data classification

Published: 01 March 2016

Abstract

Graphical abstract: Illustration of the decision hyperplanes generated by TSSVM, MCVSVM, and LMLP on an artificial dataset. [Figure omitted]

Highlights:

  • The drawbacks of both MCVSVM and LMLP are analyzed for the case in which the within-class scatter matrix is singular.
  • A novel algorithm, TSSVM, is proposed to deal with the high-dimensional data classification task where the within-class scatter matrix is singular.
  • Alternative versions of the nonlinear MCVSVM and the nonlinear LMLP are proposed.
  • The nonlinear TSSVM is developed.

The minimum class variance support vector machine (MCVSVM) and the large margin linear projection (LMLP) classifier, in contrast with the traditional support vector machine (SVM), take the distribution information of the data into consideration and can obtain better performance. However, when the within-class scatter matrix is singular, both MCVSVM and LMLP exploit the discriminant information in only one subspace of the within-class scatter matrix and discard the discriminant information in the other subspace. In this paper, a so-called twin-space support vector machine (TSSVM) algorithm is proposed to deal with the high-dimensional data classification task where the within-class scatter matrix is singular. TSSVM is rooted in both the non-null space and the null space of the within-class scatter matrix, takes full advantage of the discriminant information in the two subspaces, and so can achieve better classification accuracy. We first discuss the linear case of TSSVM and then develop the nonlinear TSSVM. Experimental results on real datasets validate the effectiveness of TSSVM and indicate its superior performance over MCVSVM and LMLP.
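The two subspaces named in the abstract can be made concrete with a small numerical sketch. The code below is not the TSSVM algorithm itself. It only builds a within-class scatter matrix for a dataset with more features than samples and splits its eigenvectors into a non-null (range) space and a null space. The function names `within_class_scatter` and `split_subspaces`, the synthetic Gaussian data, and the tolerance `tol` are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def within_class_scatter(X, y):
    """Sum over classes of (x - class_mean)(x - class_mean)^T for all samples x."""
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w

def split_subspaces(S_w, tol=1e-10):
    """Return orthonormal bases (range_space, null_space) of the symmetric matrix S_w."""
    eigvals, eigvecs = np.linalg.eigh(S_w)
    # Eigenvalues below a relative threshold are treated as numerically zero.
    threshold = tol * max(eigvals.max(), 1.0)
    nonnull = eigvals > threshold
    return eigvecs[:, nonnull], eigvecs[:, ~nonnull]

# Synthetic small-sample, high-dimensional data: 10 samples, 50 features, 2 classes.
rng = np.random.default_rng(0)
n_per_class, d = 5, 50
X = np.vstack([
    rng.normal(0.0, 1.0, (n_per_class, d)),
    rng.normal(1.0, 1.0, (n_per_class, d)),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

S_w = within_class_scatter(X, y)
U_range, U_null = split_subspaces(S_w)

# With n samples and c classes, rank(S_w) is at most n - c, so S_w is singular when d > n - c.
print("range-space dimension:", U_range.shape[1])
print("null-space dimension:", U_null.shape[1])
```

In the null space every sample of a class projects onto the same point as its class mean, so the within-class variance there is zero. That is why discarding either subspace, as the abstract says MCVSVM and LMLP do, can lose discriminant information.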

Cited By

  • (2018) Systematic Review of an Automated Multiclass Detection and Classification System for Acute Leukaemia in Terms of Evaluation and Benchmarking, Open Challenges, Issues and Methodological Aspects. Journal of Medical Systems 42:11 (1-36). DOI: 10.1007/s10916-018-1064-9. Online publication date: 1-Nov-2018

Published In

Applied Soft Computing, Volume 40, Issue C
March 2016
683 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands
