Abstract
Automated analysis of the ever-increasing number of reviews available on the Web can help businesses identify why people like or dislike products or brands, or particular aspects of them. Such analysis depends on a reliable indication of the sentiment each review is intended to convey. This sentiment is typically quantified in universal star ratings, which are not always available. We propose several statistical methods for automatically classifying the star ratings of reviews and compare their performance. Each review is represented as a binary vector whose features signal the presence of sentiment-carrying words. A nearest neighbor classifier maximizes recall, whereas a naïve Bayes classifier excels in terms of precision, accuracy, and the root mean squared error of the assigned number of stars.
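The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the toy reviews, the Hamming distance used for the nearest neighbor step, the Bernoulli form of naïve Bayes with Laplace smoothing, and all function names are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

# Hypothetical sentiment lexicon; the chapter's actual word list is not given here.
LEXICON = ["great", "good", "okay", "poor", "terrible"]

def to_binary_vector(text, lexicon=LEXICON):
    """Feature i is 1 if lexicon word i occurs in the review, else 0."""
    tokens = set(text.lower().split())
    return [1 if word in tokens else 0 for word in lexicon]

def nearest_neighbor(train, x):
    """Return the star rating of the training vector closest to x.

    Hamming distance is an assumed choice for binary vectors.
    """
    def hamming(a, b):
        return sum(ai != bi for ai, bi in zip(a, b))
    return min(train, key=lambda pair: hamming(pair[0], x))[1]

def train_naive_bayes(train, n_features):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (assumed variant)."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: [0] * n_features)
    for vector, stars in train:
        class_counts[stars] += 1
        for i, value in enumerate(vector):
            feature_counts[stars][i] += value
    model = {}
    for stars, n in class_counts.items():
        log_prior = math.log(n / len(train))
        probs = [(feature_counts[stars][i] + 1) / (n + 2) for i in range(n_features)]
        model[stars] = (log_prior, probs)
    return model

def predict_naive_bayes(model, x):
    """Return the star rating with the highest log posterior for vector x."""
    def log_posterior(stars):
        log_prior, probs = model[stars]
        return log_prior + sum(
            math.log(p if value else 1 - p) for value, p in zip(x, probs)
        )
    return max(model, key=log_posterior)

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )

# Toy training data, invented for this sketch only.
reviews = [
    ("great great product", 5),
    ("good value", 4),
    ("okay overall", 3),
    ("poor build", 2),
    ("terrible service", 1),
]
train = [(to_binary_vector(text), stars) for text, stars in reviews]
```

With this toy data, `nearest_neighbor(train, to_binary_vector("a great phone"))` returns the rating of the closest training review, and `rmse` penalizes predictions by how many stars they miss, which is why it complements accuracy for an ordinal target such as star ratings.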
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hogenboom, A., Boon, F., Frasincar, F. (2012). A Statistical Approach to Star Rating Classification of Sentiment. In: Casillas, J., Martínez-López, F., Corchado Rodríguez, J. (eds) Management Intelligent Systems. Advances in Intelligent Systems and Computing, vol 171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30864-2_24
DOI: https://doi.org/10.1007/978-3-642-30864-2_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30863-5
Online ISBN: 978-3-642-30864-2
eBook Packages: Engineering, Engineering (R0)