research-article

Quality of sentiment analysis tools: the reasons of inconsistency

Authors:

Wissam Mammar Kouadri,

Salima Benbernou,

Karima Echihabi,

Themis Palpanas,

Iheb Ben Amor

Proceedings of the VLDB Endowment, Volume 14, Issue 4

Pages 668 - 681

https://doi.org/10.14778/3436905.3436924

Published: 01 December 2020

Abstract

In this paper, we present a comprehensive study that evaluates six state-of-the-art sentiment analysis tools on five public datasets, based on the quality of their predictions in the presence of semantically equivalent documents, i.e., how consistently existing tools predict the polarity of paraphrased text. We observe that sentiment analysis tools exhibit intra-tool inconsistency, which is the prediction of different polarities for semantically equivalent documents by the same tool, and inter-tool inconsistency, which is the prediction of different polarities for semantically equivalent documents across different tools. We introduce a heuristic to assess the data quality of an augmented dataset and a new set of metrics to evaluate tool inconsistencies. Our results indicate that tool inconsistency is still an open problem, and they point towards promising research directions and accuracy improvements that can be obtained if such inconsistencies are resolved.
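
The two notions of inconsistency defined in the abstract can be illustrated with a minimal sketch. This is not the paper's metric set, which the abstract does not spell out. The function names, the toy predictions, and the choice to measure inter-tool inconsistency as per-document disagreement between tools are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy data (not from the paper): for each tool, a list of
# paraphrase clusters; each cluster holds the polarity the tool predicted
# for every document in one group of semantically equivalent texts.
toy_predictions: Dict[str, List[List[str]]] = {
    "tool_a": [["pos", "pos", "neg"], ["neg", "neg"]],
    "tool_b": [["pos", "pos", "pos"], ["neg", "pos"]],
}


def intra_tool_inconsistency_rate(clusters: List[List[str]]) -> float:
    """Fraction of paraphrase clusters in which a single tool
    predicts more than one polarity."""
    if not clusters:
        return 0.0
    inconsistent = sum(1 for labels in clusters if len(set(labels)) > 1)
    return inconsistent / len(clusters)


def inter_tool_inconsistency_rate(predictions: Dict[str, List[List[str]]]) -> float:
    """Fraction of documents on which at least two tools disagree.
    Assumes every tool was run on the same documents in the same order."""
    tools = list(predictions)
    if not tools:
        return 0.0
    # Flatten each tool's clusters into one label per document.
    flat = {
        tool: [label for cluster in predictions[tool] for label in cluster]
        for tool in tools
    }
    n_docs = len(flat[tools[0]])
    if n_docs == 0:
        return 0.0
    disagreements = sum(
        1 for i in range(n_docs) if len({flat[tool][i] for tool in tools}) > 1
    )
    return disagreements / n_docs


if __name__ == "__main__":
    for name, clusters in toy_predictions.items():
        print(name, intra_tool_inconsistency_rate(clusters))
    print("inter-tool", inter_tool_inconsistency_rate(toy_predictions))
```

In this toy setup each tool contradicts itself on one of its two clusters, and the two tools disagree on two of the five documents. A real evaluation would replace the toy labels with the outputs of actual sentiment analysis tools on a paraphrase-augmented dataset.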

Information & Contributors

Information

Published In

Proceedings of the VLDB Endowment Volume 14, Issue 4

December 2020

263 pages

ISSN:2150-8097

Editors:
Xin Luna Dong (Amazon),
Felix Naumann (HPI, University of Potsdam)

Publisher

VLDB Endowment

Cited By

Maamar-Kouadri W, Benbernou S, Ouziri M, Palpanas T, Amor I (2022) SA-Q. Proceedings of the VLDB Endowment 15:12 (3658-3661). DOI: 10.14778/3554821.3554868. Online publication date: 29-Sep-2022
https://dl.acm.org/doi/10.14778/3554821.3554868
Mammar Kouadri W, Benbernou S, Ouziri M, Ben Amor I (2022) WSSA: Weakly Supervised Semantic-based approach for Sentiment Analysis. Proceedings of the 34th International Conference on Scientific and Statistical Database Management (1-4). DOI: 10.1145/3538712.3538747. Online publication date: 6-Jul-2022
https://dl.acm.org/doi/10.1145/3538712.3538747
Seki K, Ikuta Y, Matsubayashi Y (2022) News-based business sentiment and its properties as an economic index. Information Processing and Management: an International Journal 59:2. DOI: 10.1016/j.ipm.2021.102795. Online publication date: 1-Mar-2022
https://dl.acm.org/doi/10.1016/j.ipm.2021.102795
Mercer R, Alaee S, Abdoli A, Senobari N, Singh S, Murillo A, Keogh E (2022) Introducing the contrast profile: a novel time series primitive that allows real world classification. Data Mining and Knowledge Discovery 36:2 (877-915). DOI: 10.1007/s10618-022-00824-5. Online publication date: 1-Mar-2022
https://dl.acm.org/doi/10.1007/s10618-022-00824-5
