
Investigating the Feasibility of Deep Learning Methods for Urdu Word Sense Disambiguation

Published: 31 October 2021

Abstract

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. The majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas the performance of the deep learning approaches on the Urdu Lexical Sample task (Accuracy = 72.63% and F1-Measure = 0.60) is lower than previously reported results.
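
The abstract does not give implementation details; as a rough, hypothetical sketch of how one of the evaluated architectures (a Bidirectional LSTM lexical-sample classifier) could be set up, the Python/Keras snippet below treats sense tagging of a single target word as classification of its surrounding context over a fixed sense inventory. All names, dimensions, and hyperparameters are illustrative assumptions, not values taken from the paper.

    # Hypothetical sketch of a BiLSTM lexical-sample WSD classifier (Keras).
    # Vocabulary size, embedding dimension, context length, and the number of
    # senses are illustrative placeholders, not the authors' settings.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

    VOCAB_SIZE = 20000   # assumed Urdu vocabulary size
    EMBED_DIM = 300      # assumed word-embedding dimension
    MAX_LEN = 50         # assumed context-window length in tokens
    NUM_SENSES = 4       # assumed sense inventory size for one target word

    model = Sequential([
        Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),  # map token ids to vectors
        Bidirectional(LSTM(128)),                               # read the context in both directions
        Dense(64, activation="relu"),
        Dense(NUM_SENSES, activation="softmax"),                # one probability per candidate sense
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Toy stand-in data: integer-encoded context windows and gold sense labels.
    contexts = np.random.randint(1, VOCAB_SIZE, size=(32, MAX_LEN))
    senses = np.random.randint(0, NUM_SENSES, size=(32,))
    model.fit(contexts, senses, epochs=1, batch_size=8)

For the All-Words task, the analogous setup would be a sequence-labelling variant that predicts a sense for every content word; the architectures and hyperparameters actually used in this study are described in the full text.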


Cited By

  • (2024) State-of-the-Art Approaches to Word Sense Disambiguation: A Multilingual Investigation. Pan-African Conference on Artificial Intelligence, 176–202. https://doi.org/10.1007/978-3-031-57624-9_10. Online publication date: 7 April 2024.
  • (2023) Comparison of Pre-trained vs Custom-trained Word Embedding Models for Word Sense Disambiguation. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal 12(1), e31084. https://doi.org/10.14201/adcaij.31084. Online publication date: 1 November 2023.


    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 2
    March 2022
    413 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3494070

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 31 October 2021
    Accepted: 01 July 2021
    Revised: 01 May 2021
    Received: 01 September 2019
    Published in TALLIP Volume 21, Issue 2

    Author Tags

    1. Word Sense Disambiguation
    2. deep learning
    3. Urdu All-Words WSD task
    4. Urdu Lexical Sample WSD task
    5. Recurrent Neural Network
    6. Long Short-Term Memory
    7. Gated Recurrent Units
    8. Bidirectional Long Short-Term Memory

    Qualifiers

    • Research-article
    • Refereed
