Relabeling Noisy Labels: A Twin SVM Approach

  • Research
  • Operations Research Forum

Abstract

In practical applications, supervised learning algorithms, including the support vector machine (SVM), rely heavily on precise labeling to train predictive models. Nonetheless, real-world datasets often contain mislabeled samples, which can considerably affect the performance of these algorithms. In addition, SVM incurs high computational costs on large-scale datasets. The twin support vector machine (TWSVM) tackles this issue by solving two models, each smaller than the SVM model, to find two nonparallel hyperplanes such that each one is closer to one of the two classes and at least a unit distance away from the samples of the other class. In this paper, to address label noise in datasets, we propose a TWSVM-based mixed-integer programming model that relabels instances directly while inheriting the advantages of TWSVM. Each model decides whether the samples of one class should be considered among the instances that are as close as possible to its corresponding hyperplane. Therefore, each model can recognize instances that closely resemble one class while their assigned labels belong to the other, prompting their reclassification. Conversely, instances with lower similarity to the other class retain their original labels. To show the efficiency of the proposed models, experiments are conducted on 12 UCI datasets.
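
To make the two-hyperplane idea concrete, the following is a minimal sketch of a twin-hyperplane classifier. It is not the mixed-integer relabeling model proposed in the paper. As an assumption made for brevity, it uses the least-squares variant of TWSVM, which replaces the inequality constraints with equalities so that each plane has a closed-form solution. The function names `fit_ls_twsvm` and `predict`, the penalty parameters `c1` and `c2`, and the small ridge term `reg` are all hypothetical choices for illustration.

```python
import numpy as np


def fit_ls_twsvm(A, B, c1=1.0, c2=1.0, reg=1e-8):
    """Fit two nonparallel planes (least-squares TWSVM sketch, an assumption).

    A: samples of class +1, shape (m1, n).
    B: samples of class -1, shape (m2, n).
    Returns ((w1, b1), (w2, b2)), where plane k is w_k . x + b_k = 0.
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # augmented matrix [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # augmented matrix [B e]
    ridge = reg * np.eye(E.shape[1])  # illustrative safeguard against singularity

    # Plane 1: close to class +1, about unit (functional) distance from class -1.
    # Minimizes 1/2 ||E u||^2 + c1/2 ||e + F u||^2 over u = [w1; b1].
    u = -np.linalg.solve(F.T @ F + (E.T @ E) / c1 + ridge,
                         F.T @ np.ones(F.shape[0]))

    # Plane 2: close to class -1, about unit (functional) distance from class +1.
    # Minimizes 1/2 ||F v||^2 + c2/2 ||e - E v||^2 over v = [w2; b2].
    v = np.linalg.solve(E.T @ E + (F.T @ F) / c2 + ridge,
                        E.T @ np.ones(E.shape[0]))

    return (u[:-1], u[-1]), (v[:-1], v[-1])


def predict(X, plane1, plane2):
    """Assign each row of X to the class whose plane is geometrically nearer."""
    (w1, b1), (w2, b2) = plane1, plane2
    d1 = np.abs(X @ w1 + b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 + b2) / np.linalg.norm(w2)
    return np.where(d1 <= d2, 1, -1)


# Toy data: class +1 lies on the line y = x, class -1 on the line y = x + 3.
A = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
B = np.array([[0.0, 3.0], [1.0, 4.0], [2.0, 5.0]])
plane1, plane2 = fit_ls_twsvm(A, B)
labels = predict(np.array([[3.0, 3.0], [3.0, 6.0]]), plane1, plane2)
```

Each plane is obtained from its own small linear system, which mirrors the abstract's point that TWSVM solves two smaller models instead of one large one. The paper's approach additionally introduces integer decisions about which samples belong near each plane; that part is not reproduced here.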

Availability of Data

The data for this study are taken from a standard library that is publicly available.

Funding

There is no funding for this research.

Author information

Contributions

Ali Sahleh prepared the first draft and performed the experiments. Maziar Salahi revised the draft and approved the results and numerical experiments.

Corresponding author

Correspondence to Maziar Salahi.

Ethics declarations

Ethics Approval

Not applicable.

Ethical and Informed Consent for Data Used

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors declare no competing interests.

About this article

Cite this article

Sahleh, A., Salahi, M. Relabeling Noisy Labels: A Twin SVM Approach. Oper. Res. Forum 4, 89 (2023). https://doi.org/10.1007/s43069-023-00273-w

  • DOI: https://doi.org/10.1007/s43069-023-00273-w
