Abstract
Feature selection is an essential step in text classification: it enhances model performance, reduces computational complexity, and mitigates the risk of overfitting. Filter-based methods are popular for their effectiveness and efficiency in selecting informative features. However, they often overlook feature correlations, selecting redundant or irrelevant features while underestimating others. To address this limitation, this paper proposes FS-RSA (Feature Selection through Redundancy and Synergy Analysis), a novel method for text classification. FS-RSA identifies an optimal feature subset by considering feature interactions at a low computational cost: it evaluates features within small subsets so as to maximize their synergy information and minimize their redundancy. The core principle of FS-RSA is that features offering similar classification information about the class variable are likely to be correlated and therefore redundant, whereas pairing features with high and low classification information can yield synergistic information. In experiments on five public datasets, FS-RSA was compared to five effective filter-based methods for text classification. It consistently achieved higher F1 scores with Naïve Bayes (NB) and support vector machine (SVM) classifiers, demonstrating its effectiveness in feature selection while significantly reducing dimensionality.
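The abstract does not give FS-RSA's exact scoring rule, but the redundancy/synergy idea it describes is commonly formalized with interaction information, I(X;Y;C) = I(X,Y;C) − I(X;C) − I(Y;C), which is negative for redundant feature pairs and positive for synergistic ones. The sketch below is a minimal, illustrative Python implementation of that general idea for discrete features, not the authors' FS-RSA algorithm; the `select_features` greedy strategy and all function names are assumptions for illustration.

```python
import math
from collections import Counter

def mutual_info(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def interaction_info(xs, ys, cs):
    """I(X;Y;C) = I(X,Y;C) - I(X;C) - I(Y;C).
    Positive values indicate synergy, negative values redundancy."""
    joint = list(zip(xs, ys))  # treat (X,Y) as one joint variable
    return mutual_info(joint, cs) - mutual_info(xs, cs) - mutual_info(ys, cs)

def select_features(features, labels, k):
    """Illustrative greedy selection (hypothetical, not FS-RSA itself):
    start from the feature with the highest I(X;C), then repeatedly add
    the candidate whose summed interaction information with the features
    already selected is largest, i.e. most synergy, least redundancy."""
    remaining = set(features)
    selected = [max(remaining, key=lambda f: mutual_info(features[f], labels))]
    remaining.discard(selected[0])
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: sum(
            interaction_info(features[f], features[s], labels)
            for s in selected))
        selected.append(best)
        remaining.discard(best)
    return selected
```

A toy check makes the two regimes concrete: for XOR-style labels, two features that are each individually uninformative jointly determine the class (interaction information of +1 bit, pure synergy), while a duplicated feature paired with itself yields −1 bit (pure redundancy), so the greedy picker prefers the complementary feature over the duplicate.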
Availability of supporting data
The data described in this article is publicly available at https://www.kaggle.com/datasets
Author information
Contributions
The authors, Farek Lazhar and Benaidja Amira, contributed equally to this work.
Ethics declarations
Ethical Approval
This research did not involve any studies with animal or human participants, nor did it take place in any private or protected areas.
Competing interests
The authors declare no conflicts of interest in preparing this article.
About this article
Cite this article
Farek, L., Benaidja, A. An optimal feature selection method for text classification through redundancy and synergy analysis. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19736-1