Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus

Authors:

Bonnie Dorr

EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2

Pages 599 - 608

Published: 06 August 2009

Abstract

Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above.
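
The abstract names the two ingredients (a handful of affixes and a Roget-like thesaurus) but does not spell out the procedure. The following is a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes that overtly marked pairs such as honest/dishonest yield positive and negative seed words, and that each thesaurus paragraph then takes the majority label of the seeds it contains. The prefix list, the toy thesaurus, and the function names `affix_seeds` and `label_paragraphs` are all hypothetical.

```python
from collections import Counter

# Assumed, illustrative list of negating prefixes; not taken from the paper.
NEGATING_PREFIXES = ("dis", "un", "im", "in")


def affix_seeds(vocabulary):
    """Find overtly marked pairs (X, prefix+X) that both occur in the vocabulary.

    The unmarked base X becomes a positive seed and the marked form a negative one.
    """
    vocab = set(vocabulary)
    seeds = {}
    for word in sorted(vocab):  # sorted only to make the result deterministic
        for prefix in NEGATING_PREFIXES:
            base = word[len(prefix):]
            if word.startswith(prefix) and base in vocab:
                seeds.setdefault(base, "positive")
                seeds[word] = "negative"
                break
    return seeds


def label_paragraphs(thesaurus, seeds):
    """Give every entry in a paragraph the majority label of that paragraph's seeds.

    Simplification: an entry listed in several paragraphs keeps the first label it gets.
    """
    lexicon = {}
    for paragraph in thesaurus:
        votes = Counter(seeds[w] for w in paragraph if w in seeds)
        if votes["positive"] == votes["negative"]:
            continue  # no seeds or a tie: leave this paragraph unlabeled
        label = "positive" if votes["positive"] > votes["negative"] else "negative"
        for entry in paragraph:
            lexicon.setdefault(entry, label)
    return lexicon


# Toy thesaurus: each inner list stands in for one paragraph of near-synonyms,
# including multi-word expressions.
thesaurus = [
    ["honest", "truthful", "sincere", "straight shooter"],
    ["dishonest", "deceitful", "two-faced"],
    ["happy", "cheerful", "on cloud nine"],
    ["unhappy", "miserable", "down in the dumps"],
]
vocabulary = [entry for paragraph in thesaurus for entry in paragraph]

seeds = affix_seeds(vocabulary)
lexicon = label_paragraphs(thesaurus, seeds)
```

In this toy setup only four words are seeded by the affix patterns, yet every entry in the thesaurus receives a label, including multi-word expressions such as "straight shooter". That illustrates how a thesaurus can extend a small seed set to high coverage. Naive prefix matching would also produce false pairs on a real vocabulary (for example "image" against "age"), so a practical version would need filtering.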

Index Terms

Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Modeling and simulation
    1. Model development and analysis
      1. Model verification and validation
      2. Modeling methodologies
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection
      2. Thesauri

Information & Contributors

Information

Published In

EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2

August 2009

616 pages

ISBN: 9781932432626

Program Chairs: Philipp Koehn (University of Edinburgh), Rada Mihalcea (University of North Texas)

Publisher

Association for Computational Linguistics

United States

Publication History

Published: 06 August 2009

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 73 of 234 submissions, 31%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 28
Total Downloads: 805
Downloads (Last 12 months): 80
Downloads (Last 6 weeks): 9

Reflects downloads up to 13 Jan 2025

Citations

Cited By

Das R, Singh T (2023). Multimodal Sentiment Analysis: A Survey of Methods, Trends, and Challenges. ACM Computing Surveys 55:13s (1-38). Online publication date: 13-Jul-2023.
https://dl.acm.org/doi/10.1145/3586075
Deng D, Jing L, Yu J, Sun S (2019). Sparse Self-Attention LSTM for Sentiment Lexicon Construction. IEEE/ACM Transactions on Audio, Speech and Language Processing 27:11 (1777-1790). Online publication date: 1-Nov-2019.
https://dl.acm.org/doi/10.1109/TASLP.2019.2933326
Feldman S, Yalcin O, DiPaola S (2017). Engagement with artificial intelligence through natural interaction models. Proceedings of the conference on Electronic Visualisation and the Arts (296-303). Online publication date: 10-Jul-2017.
https://dl.acm.org/doi/10.14236/ewic/EVA2017.60
Wang B, Huang Y, Yuan Z, Li X (2016). A multi-granularity fuzzy computing model for sentiment classification of Chinese reviews. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 30:3 (1445-1460). Online publication date: 1-Jan-2016.
https://dl.acm.org/doi/10.3233/IFS-151853
Rajput Q, Haider S, Ghani S (2016). Lexicon-Based Sentiment Analysis of Teachers’ Evaluation. Applied Computational Intelligence and Soft Computing 2016 (1). Online publication date: 1-Oct-2016.
https://dl.acm.org/doi/10.1155/2016/2385429
Yang X, Zhang Z, Zhang Z, Mo Y, Li L, Yu L, Zhu P (2016). Automatic Construction and Global Optimization of a Multisentiment Lexicon. Computational Intelligence and Neuroscience 2016 (2). Online publication date: 1-Nov-2016.
https://dl.acm.org/doi/10.1155/2016/2093406
Zhou Y, Cristea A, Nejdl W, Hall W, Parigi P, Staab S (2016). Towards detection of influential sentences affecting reputation in wikipedia. Proceedings of the 8th ACM Conference on Web Science (244-248). Online publication date: 22-May-2016.
https://dl.acm.org/doi/10.1145/2908131.2908177
Oliveira N, Cortez P, Areal N (2016). Stock market sentiment lexicon acquisition using microblogging data and statistical measures. Decision Support Systems 85:C (62-73). Online publication date: 1-May-2016.
https://dl.acm.org/doi/10.1016/j.dss.2016.02.013
Wang B, Huang Y, Wu X, Li X (2015). A fuzzy computing model for identifying polarity of chinese sentiment words. Computational Intelligence and Neuroscience 2015 (47-47). Online publication date: 1-Jan-2015.
https://dl.acm.org/doi/10.1155/2015/525437
Feng S, Song K, Wang D, Yu G (2015). A word-emoticon mutual reinforcement ranking model for building sentiment lexicon from massive collection of microblogs. World Wide Web 18:4 (949-967). Online publication date: 1-Jul-2015.
https://dl.acm.org/doi/10.1007/s11280-014-0289-x
