
Investigating the Feasibility of Deep Learning Methods for Urdu Word Sense Disambiguation

Published: 31 October 2021

Abstract

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. The majority of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas the performance of the deep learning approaches on the Urdu Lexical Sample task (Accuracy = 72.63% and F1-Measure = 0.60) is lower than previously reported results.
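
The abstract does not give implementation details; as a rough, hypothetical sketch of how one of the evaluated architectures (a Bidirectional LSTM lexical-sample classifier) could be set up, the Python/Keras snippet below treats sense tagging of a single target word as classification of its surrounding context over a fixed sense inventory. All names, dimensions, and hyperparameters are illustrative assumptions, not values taken from the paper.

    # Hypothetical sketch of a BiLSTM lexical-sample WSD classifier (Keras).
    # Vocabulary size, embedding dimension, context length, and the number of
    # senses are illustrative placeholders, not the authors' settings.
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

    VOCAB_SIZE = 20000   # assumed Urdu vocabulary size
    EMBED_DIM = 300      # assumed word-embedding dimension
    MAX_LEN = 50         # assumed context-window length in tokens
    NUM_SENSES = 4       # assumed sense inventory size for one target word

    model = Sequential([
        Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),  # map token ids to vectors
        Bidirectional(LSTM(128)),                               # read the context in both directions
        Dense(64, activation="relu"),
        Dense(NUM_SENSES, activation="softmax"),                # one probability per candidate sense
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Toy stand-in data: integer-encoded context windows and gold sense labels.
    contexts = np.random.randint(1, VOCAB_SIZE, size=(32, MAX_LEN))
    senses = np.random.randint(0, NUM_SENSES, size=(32,))
    model.fit(contexts, senses, epochs=1, batch_size=8)

For the All-Words task, the analogous setup would be a sequence-labelling variant that predicts a sense for every content word; the architectures and hyperparameters actually used in this study are described in the full text.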


Cited By

  • (2024) State-of-the-Art Approaches to Word Sense Disambiguation: A Multilingual Investigation. Pan-African Conference on Artificial Intelligence, 176–202. https://doi.org/10.1007/978-3-031-57624-9_10. Online publication date: 7 April 2024.
  • (2023) Comparison of Pre-trained vs Custom-trained Word Embedding Models for Word Sense Disambiguation. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal 12(1), e31084. https://doi.org/10.14201/adcaij.31084. Online publication date: 1 November 2023.


    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 2
    March 2022
    413 pages
    ISSN: 2375-4699
    EISSN: 2375-4702
    DOI: 10.1145/3494070

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 31 October 2021
    Accepted: 01 July 2021
    Revised: 01 May 2021
    Received: 01 September 2019
    Published in TALLIP Volume 21, Issue 2

    Author Tags

    1. Word Sense Disambiguation
    2. deep learning
    3. Urdu All-Words WSD task
    4. Urdu Lexical Sample WSD task
    5. Recurrent Neural Network
    6. Long Short-Term Memory
    7. Gated Recurrent Units
    8. Bidirectional Long Short-Term Memory

    Qualifiers

    • Research-article
    • Refereed
