DOI: 10.1145/3289402.3289544
research-article

Context-based Arabic Word Sense Disambiguation using Short Text Similarity Measure

Published: 24 October 2018

Abstract

Word Sense Disambiguation (WSD) is the process of determining which sense of a word is used in a given context. Most Arabic WSD systems rely on information extracted from the local context of the word to be disambiguated, typically by counting the words that overlap between two concept definitions. This information is often insufficient for accurate disambiguation. Because concept definitions are short, we believe that a semantic short text similarity measure can improve the identification of which sense of a word is used in a context.
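The overlap-counting baseline described above can be illustrated with a minimal sketch. It assumes whitespace tokenization and uses English glosses for readability; the function names and the example senses are hypothetical and do not come from the paper, and a real Arabic system would add stemming or root extraction.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real Arabic preprocessing)."""
    return text.lower().split()


def overlap_score(text_a, text_b):
    """Count the distinct words shared by two short texts."""
    return len(set(tokenize(text_a)) & set(tokenize(text_b)))


def pick_sense_by_overlap(context, sense_definitions):
    """Return the sense whose definition shares the most words with the context."""
    return max(sense_definitions,
               key=lambda sense: overlap_score(context, sense_definitions[sense]))


# Hypothetical glosses for the ambiguous word "bank".
senses = {
    "financial": "an institution that accepts money deposits",
    "river": "sloping land beside a body of water",
}
chosen = pick_sense_by_overlap("he deposited money at the bank", senses)
```

Here only the word "money" links the context to the first gloss, and "deposited" does not match "deposits". This shows how fragile exact word overlap is on short definitions.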
In this paper, we propose an efficient method for computing the semantic relatedness between senses. To this end, we reintroduce the Web-based Kernel function for measuring the semantic relatedness between concepts, and use it to disambiguate an expression against multiple candidate concepts. The proposed method has been tested, evaluated, and compared using an Arabic short text categorization system in terms of the F1-measure. The results obtained support the value of our proposal.
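As a rough illustration of kernel-based sense selection, the sketch below expands each short text with retrieved snippets, builds a normalized centroid vector, and scores two texts by the dot product of their expansions. It is a simplified sketch under stated assumptions, not the paper's implementation: `TOY_INDEX` and `retrieve_snippets` are hypothetical stand-ins for a web search engine, raw term counts replace the TF-IDF weighting of the original web-based kernel, and the English glosses are invented for the example.

```python
import math
from collections import Counter

# Hypothetical in-memory stand-in for search engine results (assumption).
TOY_INDEX = {
    "deposit at the bank": [
        "customers deposit money into a savings account",
        "the bank pays interest on money",
    ],
    "financial institution": [
        "a financial institution manages money and loans",
        "savings account interest rates",
    ],
    "river side": [
        "the river side has sloping land and water",
        "fishing along the water",
    ],
}


def retrieve_snippets(text):
    """Return snippets for a short text; fall back to the text itself."""
    return TOY_INDEX.get(text, [text])


def l2_normalize(vector):
    """Scale a sparse term vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vector.values()))
    return {t: w / norm for t, w in vector.items()} if norm else {}


def expand(text):
    """Centroid of the unit-length term vectors of the retrieved snippets."""
    centroid = Counter()
    for snippet in retrieve_snippets(text):
        for term, weight in l2_normalize(Counter(snippet.lower().split())).items():
            centroid[term] += weight
    return l2_normalize(centroid)


def kernel(text_a, text_b):
    """Similarity of two short texts: dot product of their expansions."""
    qa, qb = expand(text_a), expand(text_b)
    return sum(w * qb.get(t, 0.0) for t, w in qa.items())


def disambiguate(context, sense_definitions):
    """Pick the sense whose definition is most similar to the context."""
    return max(sense_definitions,
               key=lambda sense: kernel(context, sense_definitions[sense]))


glosses = {"financial": "financial institution", "river": "river side"}
best = disambiguate("deposit at the bank", glosses)
```

In this toy setup the context and the glosses share no words directly, so the ranking comes entirely from the expanded representations. A function word such as "the" still contributes to the score, which is one reason a TF-IDF weighting would be preferable in practice.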

Cited By

  • (2021) Hybrid approach for semantic similarity calculation between Tamil words. International Journal of Innovative Computing and Applications 12:1 (13-23). DOI: 10.1504/ijica.2021.113609. Online publication date: 1-Jan-2021

      Information

      Published In

      SITA'18: Proceedings of the 12th International Conference on Intelligent Systems: Theories and Applications
      October 2018
      301 pages
      ISBN:9781450364621
      DOI:10.1145/3289402
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Arabic Language
      2. Context Concept
      3. Rough Set Theory
      4. Short Text Similarity
      5. WSD
      6. Word Sense Disambiguation

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      SITA'18
      SITA'18: THEORIES AND APPLICATIONS
      October 24 - 25, 2018
      Rabat, Morocco
