DOI: 10.1145/3654446.3654519
Research article

Feature selection SVM through Universum and its applications on text classification

Published: 03 May 2024

Abstract

The continuous growth of digital text makes text classification a key task. The Support Vector Machine (SVM) has become a widely used classifier because of its strong generalization ability and its dependence on only a few parameters. However, SVM was not originally designed to identify relevant features. This research applies SVM and Universum learning to text classification and examines their effect on processing unlabeled data, improving generalization, and solving the feature selection problem. By introducing a Universum set, we construct an extended dataset and incorporate the concept of Universum embedding. We propose a Feature Selection Universum Support Vector Machine (FSUSVM). On this extended dataset, the model adds constraints on the prediction boundaries of the Universum set to the existing SVM model, with the aim of improving both accuracy and feature selection performance in text classification tasks. Finally, we substantiate the effectiveness of FSUSVM through numerical experiments on text data. We also evaluated FSUSVM on image data, with positive results.
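
The abstract does not state the optimization problem, so the following is only an illustrative sketch of how a sparsity-inducing SVM with Universum constraints is commonly written, not the authors' FSUSVM formulation. It assumes a linear model, a 1-norm penalty on the weight vector for feature selection, and an ε-insensitive band that keeps Universum points close to the decision boundary. The symbols C, C_u, ε, ξ_i, and ψ_j are hypothetical names chosen for this illustration.

```latex
% Illustrative sketch only (assumed form, not taken from the paper).
% Labeled data: (x_i, y_i), i = 1..l, with y_i in {-1, +1}.
% Universum data: u_j, j = 1..m (belong to neither class).
\begin{aligned}
\min_{w,\,b,\,\xi,\,\psi}\quad
  & \|w\|_1 + C \sum_{i=1}^{l} \xi_i + C_u \sum_{j=1}^{m} \psi_j \\
\text{s.t.}\quad
  & y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0,
    \quad i = 1, \dots, l, \\
  & -\varepsilon - \psi_j \le w^\top u_j + b \le \varepsilon + \psi_j,
    \quad \psi_j \ge 0, \quad j = 1, \dots, m.
\end{aligned}
```

Under these assumptions, features whose weights w_k are driven to zero by the 1-norm term are the ones discarded, and the Universum constraints penalize any u_j whose prediction falls outside the band of width ε around the boundary.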

    Published In

    SPCNC '23: Proceedings of the 2nd International Conference on Signal Processing, Computer Networks and Communications
    December 2023
    435 pages
    ISBN:9798400716430
    DOI:10.1145/3654446
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SPCNC 2023

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 16
    • Downloads (Last 12 months): 16
    • Downloads (Last 6 weeks): 8
    Reflects downloads up to 10 Dec 2024
