Abstract
Sentiment classification is an important task in text mining: it classifies documents according to the opinion or sentiment they express. Documents in sentiment classification can be represented as feature vectors, which machine learning algorithms use to perform classification. Feature selection is a necessary step when building these feature vectors. In this paper, we propose a feature selection method called fitness proportionate selection binary particle swarm optimization (F-BPSO). Binary particle swarm optimization (BPSO) is the binary version of particle swarm optimization and can be applied to feature selection. F-BPSO modifies BPSO to overcome two problems of traditional BPSO: an unreasonable velocity update formula and the lack of evaluation of each individual feature. We then refine the original F-BPSO, among other changes by using the fitness sum instead of the average fitness in the fitness proportionate selection step. The resulting method is called fitness sum proportionate selection binary particle swarm optimization (FS-BPSO). Further modifications make FS-BPSO more suitable for feature selection in sentiment classification. This variant is named SCO-FS-BPSO, where SCO stands for “sentiment classification-oriented”. Experimental results show that, on benchmark datasets, the original F-BPSO outperforms traditional BPSO in feature selection and FS-BPSO outperforms the original F-BPSO. In addition, in the sentiment classification domain, SCO-FS-BPSO, which is tailored to sentiment classification, is superior to traditional feature selection methods on subjective consumer review datasets.
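For readers unfamiliar with the baseline, the following is a minimal sketch of the traditional BPSO update that the abstract refers to, in the commonly cited form of Kennedy and Eberhart (1997), applied to a bit mask over features. It is not the F-BPSO, FS-BPSO, or SCO-FS-BPSO method, whose details the abstract does not give. The function names, the parameter values (`c1`, `c2`, `v_max`, swarm size), and the toy fitness function are all assumptions made for illustration; in a real wrapper setting the fitness would typically be classifier accuracy on the selected features.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(positions, velocities, pbest, gbest, rng,
              w=1.0, c1=2.0, c2=2.0, v_max=4.0):
    """One iteration of the classic binary PSO update, in place.

    w=1.0 means no inertia damping; c1, c2 and v_max are assumed values.
    """
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            # Velocity is pulled toward the personal and global best bits.
            v[d] = (w * v[d]
                    + c1 * r1 * (pbest[i][d] - x[d])
                    + c2 * r2 * (gbest[d] - x[d]))
            # Clamp so the sigmoid never saturates completely.
            v[d] = max(-v_max, min(v_max, v[d]))
            # The bit is resampled: velocity sets the probability of a 1.
            x[d] = 1 if rng.random() < sigmoid(v[d]) else 0


def toy_fitness(mask, relevant=(0, 2, 5)):
    """Hypothetical fitness: reward relevant features, penalize subset size."""
    hits = sum(mask[d] for d in relevant)
    return hits - 0.1 * sum(mask)


def run_bpso(n_features=8, n_particles=10, n_iters=30, seed=0):
    """Run the sketch and return the best feature mask and its fitness."""
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [toy_fitness(p) for p in positions]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]
    for _ in range(n_iters):
        bpso_step(positions, velocities, pbest, gbest, rng)
        for i, p in enumerate(positions):
            f = toy_fitness(p)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = p[:], f
                if f > gbest_fit:
                    gbest, gbest_fit = p[:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    mask, fit = run_bpso()
    print("selected features:", [d for d, bit in enumerate(mask) if bit])
    print("fitness:", fit)
```

Note that in this baseline the whole bit mask is scored at once, so no individual feature receives its own evaluation; this is one of the two limitations of traditional BPSO that the abstract says F-BPSO addresses.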
Acknowledgments
This work is supported by the National Natural Science Foundation of China (NSFC No. 61170180, NSFC No. 61403200).
Ethics declarations
Conflict of interest
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Additional information
Communicated by B. Xue and A. G. Chen.
About this article
Cite this article
Shang, L., Zhou, Z. & Liu, X. Particle swarm optimization-based feature selection in sentiment classification. Soft Comput 20, 3821–3834 (2016). https://doi.org/10.1007/s00500-016-2093-2
DOI: https://doi.org/10.1007/s00500-016-2093-2