A simple and reliable instance selection for fast training support vector machine: Valid Border Recognition

Published: 01 September 2023

Abstract

Support vector machines (SVMs) are powerful statistical learning tools, but training them on large datasets is time-consuming. To address this issue, various instance selection (IS) approaches have been proposed, which choose a small fraction of critical instances and screen out the rest before training. However, existing methods do not balance accuracy and efficiency well: some miss critical instances, while others use selection schemes so complicated that they require more execution time than training on all original instances, defeating the purpose of IS. In this work, we present a newly developed IS method called Valid Border Recognition (VBR). VBR selects the closest heterogeneous neighbors as valid border instances and folds this selection into the creation of a reduced Gaussian kernel matrix, thus minimizing execution time. To improve reliability, we propose a strengthened version of VBR (SVBR) which, starting from VBR, gradually adds farther heterogeneous neighbors as complements until the Lagrange multipliers of the already selected instances become stable. In numerical experiments, the effectiveness of the proposed methods is verified on benchmark and synthetic datasets in terms of accuracy, execution time and inference time.
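
Because the abstract only sketches the selection rule in words, a short illustration may help: the sketch below keeps, for every instance, its closest opposite-class ("heterogeneous") neighbor and trains the SVM on that reduced set. This is a minimal toy example of the valid-border idea, not the paper's VBR algorithm; the synthetic data, the `gamma` value and the function names are assumptions introduced here.

```python
# Minimal sketch of the valid-border idea described in the abstract, NOT the
# authors' exact VBR/SVBR implementation. Assumptions: Euclidean feature space,
# distances read off a Gaussian (RBF) kernel matrix, and an illustrative gamma;
# function and variable names are ours, not the paper's.
import numpy as np
from sklearn.svm import SVC


def valid_border_selection(X, y, gamma=1.0):
    """Return indices of instances that are the closest heterogeneous
    (opposite-class) neighbor of at least one other instance."""
    # Full Gaussian kernel matrix; a larger kernel value means a smaller distance.
    sq_dists = (X**2).sum(1)[:, None] + (X**2).sum(1)[None, :] - 2 * X @ X.T
    K = np.exp(-gamma * sq_dists)
    selected = set()
    for i in range(len(X)):
        hetero = np.where(y != y[i])[0]            # instances of the other class
        closest = hetero[np.argmax(K[i, hetero])]  # closest heterogeneous neighbor
        selected.add(closest)
    return np.fromiter(selected, dtype=int)


# Usage: train the SVM only on the selected border instances.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(3.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
idx = valid_border_selection(X, y, gamma=0.5)
clf = SVC(kernel="rbf", gamma=0.5).fit(X[idx], y[idx])
print(f"kept {len(idx)} of {len(X)} instances")
```

The SVBR refinement from the abstract would wrap this selection in a loop that adds progressively farther heterogeneous neighbors and stops once the Lagrange multipliers of the already selected instances (exposed, up to the class sign, as `clf.dual_coef_` in scikit-learn) stop changing; that loop is omitted in this sketch.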

Published In

Neural Networks, Volume 166, Issue C
September 2023, 740 pages

Publisher

Elsevier Science Ltd., United Kingdom

Author Tags

  1. Instance selection
  2. Support vector machine
  3. Neighborhood approach
  4. Distance-based approach
  5. Valid border instance

Qualifiers

  • Research-article
