DOI: 10.5555/1699571.1699591

Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus

Published: 06 August 2009

Abstract

Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as the gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other lexicon. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above.
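
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a lexicon might be grown from affix-marked word pairs and a thesaurus. The prefix list, the toy thesaurus, the function names, and the majority-vote rule for labelling categories are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter

# Hypothetical affix list: a word carrying one of these prefixes is treated as
# negative and its unmarked counterpart as positive (e.g. honest / dishonest).
NEGATING_PREFIXES = ("dis", "un", "in", "im")


def seed_pairs(vocabulary):
    """Return {word: +1 or -1} for every unmarked/marked pair found in the vocabulary."""
    vocab = set(vocabulary)
    seeds = {}
    for word in sorted(vocab):
        for prefix in NEGATING_PREFIXES:
            base = word[len(prefix):]
            if word.startswith(prefix) and base in vocab:
                seeds[base] = 1    # unmarked form: assumed positive
                seeds[word] = -1   # overtly marked form: assumed negative
    return seeds


def build_lexicon(thesaurus, seeds):
    """Label each thesaurus category by the majority polarity of its seed words,
    then label each entry by the majority polarity of its categories."""
    votes_per_entry = {}
    for entries in thesaurus.values():
        votes = Counter(seeds[e] for e in entries if e in seeds)
        if votes[1] == votes[-1]:
            continue  # no seeds in this category, or a tie: leave it unlabelled
        polarity = 1 if votes[1] > votes[-1] else -1
        for entry in entries:
            votes_per_entry.setdefault(entry, []).append(polarity)
    return {e: (1 if sum(v) > 0 else -1)
            for e, v in votes_per_entry.items() if sum(v) != 0}


# Toy Roget-like thesaurus (invented for this sketch): category -> entries,
# including multi-word expressions.
thesaurus = {
    "probity": ["honest", "trustworthy", "above board"],
    "improbity": ["dishonest", "crooked", "double dealing"],
}
vocabulary = [w for entries in thesaurus.values() for w in entries]

seeds = seed_pairs(vocabulary)
lexicon = build_lexicon(thesaurus, seeds)
print(lexicon)
```

In this toy run the single seed pair honest/dishonest is enough to label every entry in both categories, which hints at how a handful of affixes could give high coverage when combined with a thesaurus.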

Published In

EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
August 2009
616 pages
ISBN:9781932432626

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 73 of 234 submissions, 31%
