Abstract
In practical applications, supervised learning algorithms, including the support vector machine (SVM), rely heavily on accurate labels to train predictive models. Real-world datasets, however, often contain mislabeled samples, which can considerably degrade the performance of these algorithms. SVM also incurs a high computational cost on large-scale datasets. The twin support vector machine (TWSVM) addresses this issue by solving two models, each smaller than the SVM model, to find two nonparallel hyperplanes such that each hyperplane lies close to one of the two classes and at least a unit distance away from the samples of the other class. In this paper, to address label noise in datasets, we propose a TWSVM-based mixed-integer programming model that relabels instances directly while inheriting the advantages of TWSVM. Each model decides whether the samples of one class should be counted among the instances that lie as close as possible to its corresponding hyperplane. Each model can therefore recognize instances that closely resemble one class but carry the label of the other, and it relabels them. Conversely, instances that show little similarity to the other class retain their original labels. To demonstrate the efficiency of the proposed models, experiments are conducted on 12 UCI datasets.
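For orientation, the pair of smaller problems solved by the standard linear TWSVM (Jayadeva et al., 2007) can be written as below. This is a sketch of the classical formulation only, not the mixed-integer relabeling model proposed in the paper. The notation is assumed here for illustration: the rows of A and B hold the samples of the two classes, e_1 and e_2 are vectors of ones of matching length, c_1, c_2 > 0 are penalty parameters, and \xi, \eta are slack vectors.

```latex
\begin{aligned}
\min_{w_1,\, b_1,\, \xi}\quad & \tfrac{1}{2}\,\lVert A w_1 + e_1 b_1 \rVert^2 + c_1\, e_2^{\top} \xi \\
\text{s.t.}\quad & -(B w_1 + e_2 b_1) + \xi \ge e_2, \qquad \xi \ge 0, \\[6pt]
\min_{w_2,\, b_2,\, \eta}\quad & \tfrac{1}{2}\,\lVert B w_2 + e_2 b_2 \rVert^2 + c_2\, e_1^{\top} \eta \\
\text{s.t.}\quad & (A w_2 + e_1 b_2) + \eta \ge e_1, \qquad \eta \ge 0, \\[6pt]
\text{class}(x) \;=\; & \arg\min_{k \in \{1,2\}} \frac{\lvert x^{\top} w_k + b_k \rvert}{\lVert w_k \rVert}.
\end{aligned}
```

The quadratic term in each problem keeps the hyperplane close to its own class, while the constraints push the samples of the other class to a functional distance of at least one, up to the slack. A new point is assigned to the class whose hyperplane is nearer.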
Availability of Data
The data for this study are taken from a standard library that is publicly available.
Funding
There is no funding for this research.
Author information
Contributions
Ali Sahleh prepared the first draft and performed experiments. Maziar Salahi revised the draft and approved the results and numerical experiments.
Ethics declarations
Ethics Approval
Not applicable.
Ethical and Informed Consent for Data Used
Not applicable.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Competing Interests
The authors declare no competing interests.
About this article
Cite this article
Sahleh, A., Salahi, M. Relabeling Noisy Labels: A Twin SVM Approach. Oper. Res. Forum 4, 89 (2023). https://doi.org/10.1007/s43069-023-00273-w
DOI: https://doi.org/10.1007/s43069-023-00273-w