Context-based Arabic Word Sense Disambiguation using Short Text Similarity Measure

Authors:

Mohammed Bekkali,

Abdelmonaime Lachkar

SITA'18: Proceedings of the 12th International Conference on Intelligent Systems: Theories and Applications

Article No.: 44, Pages 1 - 6

https://doi.org/10.1145/3289402.3289544

Published: 24 October 2018

Abstract

Word Sense Disambiguation (WSD) is the process of determining which sense of a word is used in a given context. Most Arabic WSD systems rely on information extracted from the local context of the word to be disambiguated, computing the number of overlapping words between two concept definitions. This information is often insufficient for accurate disambiguation. Because concept definitions are short, we believe that a semantic short-text similarity measure can improve the identification of which sense of a word is used in a context.
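The overlap-counting baseline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenizer, and the English toy glosses are all assumptions chosen for readability. It scores each candidate sense by the number of words its definition shares with the context and picks the highest-scoring sense.

```python
def tokenize(text):
    # Naive whitespace tokenizer (an assumption; a real Arabic system
    # would apply stemming or root extraction first).
    return text.lower().split()

def overlap_score(context, gloss):
    # Number of distinct words shared by the context and a sense definition.
    return len(set(tokenize(context)) & set(tokenize(gloss)))

def disambiguate_by_overlap(context, sense_glosses):
    # Pick the sense whose definition overlaps most with the context.
    return max(sense_glosses, key=lambda s: overlap_score(context, sense_glosses[s]))

# Hypothetical toy data, for illustration only.
CONTEXT = "he deposited money at the bank"
GLOSSES = {
    "financial": "an institution that accepts money deposits",
    "river": "sloping land beside a river",
}
chosen = disambiguate_by_overlap(CONTEXT, GLOSSES)
```

In this toy case the only shared word is "money"; "deposited" and "deposits" do not match at all, which hints at why raw overlap on short definitions can be too sparse a signal.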

In this paper, we propose an efficient method for computing the semantic relatedness between senses. To this end, we reintroduce the Web-based Kernel function for measuring the semantic relatedness between concepts, in order to disambiguate an expression against multiple candidate concepts. The proposed method has been tested, evaluated, and compared using an Arabic short text categorization system in terms of the F1-measure. The results obtained show the value of our proposal.
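As a rough illustration of the idea behind a web-based kernel for short texts, the sketch below expands each text with the documents a search engine returns for it, averages their unit-length term vectors into a centroid, and takes the inner product of the two centroids as the similarity. It is a simplified sketch, not the method as implemented in the paper: plain term frequencies stand in for TF-IDF weights, and `toy_search`, `TOY_INDEX`, and all function names are hypothetical stand-ins for a real search engine and pipeline.

```python
import math
from collections import Counter

def unit(vec):
    # Scale a sparse term vector to unit L2 length (empty if all-zero).
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def expand(text, search, n_docs=3):
    # Query expansion: normalized centroid of the top results' term vectors.
    # Assumption: raw term frequency is used here instead of TF-IDF.
    centroid = Counter()
    for doc in search(text)[:n_docs]:
        for term, weight in unit(Counter(doc.lower().split())).items():
            centroid[term] += weight
    return unit(centroid)

def kernel(x, y, search):
    # Similarity of two short texts = inner product of their expansions.
    qx, qy = expand(x, search), expand(y, search)
    return sum(w * qy.get(t, 0.0) for t, w in qx.items())

def best_sense(context, sense_glosses, search):
    # Choose the sense whose definition is most similar to the context.
    return max(sense_glosses, key=lambda s: kernel(context, sense_glosses[s], search))

# Hypothetical stand-in for a search engine: query -> returned snippets.
TOY_INDEX = {
    "he deposited money at the bank": ["bank account money deposit", "money deposit savings bank"],
    "an institution that accepts money deposits": ["bank institution money deposit", "savings deposit institution"],
    "sloping land beside a river": ["river shore land water", "river water stream"],
}

def toy_search(query):
    return TOY_INDEX.get(query, [])

CONTEXT = "he deposited money at the bank"
GLOSSES = {
    "financial": "an institution that accepts money deposits",
    "river": "sloping land beside a river",
}
chosen = best_sense(CONTEXT, GLOSSES, toy_search)
```

Because the comparison is made on the expanded vectors rather than on the texts themselves, two short definitions can be judged related even when they share few literal words.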

Cited By

Karuppaiah D, Vincent P (2021). Hybrid approach for semantic similarity calculation between Tamil words. International Journal of Innovative Computing and Applications, 12:1 (13-23). DOI: 10.1504/ijica.2021.113609. Online publication date: 1-Jan-2021.
https://dl.acm.org/doi/10.1504/ijica.2021.113609

Index Terms

Context-based Arabic Word Sense Disambiguation using Short Text Similarity Measure
1. Information systems
  1. Information retrieval
    1. Document representation
  2. World Wide Web
    1. Web applications
      1. Social networks

Published In

SITA'18: Proceedings of the 12th International Conference on Intelligent Systems: Theories and Applications

October 2018

301 pages

ISBN: 9781450364621

DOI: 10.1145/3289402

Conference Chairs: Abdelaziz Berrado, Zohra Bakkoury

Program Chairs: Bernadette Bouchon-Meunier, Mohammed Ramdani

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SITA'18

SITA'18: THEORIES AND APPLICATIONS

October 24 - 25, 2018

Rabat, Morocco
