
poster

Two birds with one stone: learning semantic models for text categorization and word sense disambiguation

Authors:

Roberto Navigli,

Stefano Faralli,

Oier de Lacalle,

Eneko Agirre

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management

Pages 2317 - 2320

https://doi.org/10.1145/2063576.2063955

Published: 24 October 2011

Abstract

In this paper we present a novel approach to learning semantic models for multiple domains, which we use to categorize Wikipedia pages and to perform domain Word Sense Disambiguation (WSD). To learn a semantic model for each domain, we first extract relevant terms from the texts in the domain and then use these terms to initialize a random walk over the WordNet graph. Given an input text, we check the semantic models, choose the appropriate domain for that text, and use the best-matching model to perform WSD. Our results show considerable improvements on text categorization and domain WSD tasks.
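The pipeline described in the abstract (seed a random walk with domain terms, pick the best-matching domain model for a text, then disambiguate with that model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the six-node graph stands in for WordNet, the `SENSES` and `DOMAIN_TERMS` tables are invented, the random walk is a plain personalized PageRank, and the domain score (sum of the best sense mass per word) is a hypothetical choice. The abstract does not specify any of these details, so this is not the authors' implementation.

```python
# Toy stand-in for the WordNet graph: undirected edges between sense nodes.
# (Hypothetical data; the real method walks over the full WordNet graph.)
GRAPH = {
    "bank.finance": ["money", "loan"],
    "bank.river": ["water", "shore"],
    "money": ["bank.finance", "loan"],
    "loan": ["bank.finance", "money"],
    "water": ["bank.river", "shore"],
    "shore": ["bank.river", "water"],
}

# Hypothetical sense inventory: word -> candidate sense nodes.
SENSES = {
    "bank": ["bank.finance", "bank.river"],
    "money": ["money"],
    "loan": ["loan"],
    "water": ["water"],
    "shore": ["shore"],
}

# Hypothetical terms extracted from each domain's texts.
DOMAIN_TERMS = {
    "finance": ["money", "loan"],
    "geography": ["water", "shore"],
}


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Random walk with restart: teleport mass always returns to the seeds."""
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in graph}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                continue
            share = damping * rank[node] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return rank


def build_models(graph, domain_terms):
    """One semantic model (a distribution over sense nodes) per domain."""
    models = {}
    for domain, terms in domain_terms.items():
        seeds = {s for t in terms for s in SENSES[t]}
        models[domain] = personalized_pagerank(graph, seeds)
    return models


def choose_domain(models, words):
    """Pick the domain whose model gives the text's words the most mass."""
    def score(model):
        return sum(max(model[s] for s in SENSES[w]) for w in words if w in SENSES)
    return max(models, key=lambda d: score(models[d]))


def disambiguate(model, word):
    """Choose the candidate sense with the highest mass in the chosen model."""
    return max(SENSES[word], key=lambda s: model[s])


MODELS = build_models(GRAPH, DOMAIN_TERMS)

if __name__ == "__main__":
    text = ["bank", "loan"]
    domain = choose_domain(MODELS, text)
    print(domain, disambiguate(MODELS[domain], "bank"))
```

In this toy setup the two domains occupy disconnected parts of the graph, so a text containing "loan" selects the finance model and resolves "bank" to its financial sense, while a text containing "water" selects the geography model and the river sense. On a real lexical graph the domain models overlap and the scores are graded rather than all-or-nothing.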



Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources



Published In

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management

October 2011

2712 pages

ISBN: 9781450307178

DOI: 10.1145/2063576

Editors: Bettina Berendt, Arjen de Vries, Wenfei Fan, Craig Macdonald (University of Glasgow, UK), Iadh Ounis (University of Glasgow, UK), Ian Ruthven (University of Strathclyde, UK)

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

CIKM '11

CIKM '11: International Conference on Information and Knowledge Management

October 24 - 28, 2011

Glasgow, Scotland, UK

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
