Parameter optimization for machine-learning of word sense disambiguation

Abstract

Various Machine Learning (ML) approaches have been shown to produce relatively successful Word Sense Disambiguation (WSD) systems. Differences in the measured performance of different algorithms remain unexplained, which warrants further investigation into which algorithm has the right ‘bias’ for this task. In this paper, we show that this is not easy to establish, owing to intricate interactions between information sources, parameter settings, and properties of the training data. We investigate the impact of parameter optimization on generalization accuracy in a memory-based learning approach to English and Dutch WSD. We adopt a ‘word-expert’ architecture, which yields a set of classifiers, each specialized in a single wordform. Each expert consists of multiple memory-based learning classifiers that take different information sources as input and are combined in a voting scheme. We optimized the architectural and parametric settings for each individual word-expert by performing cross-validation experiments on the learning material. The results of these experiments show that varying both the algorithmic parameters and the information sources available to the classifiers leads to large fluctuations in accuracy. We demonstrate that optimization per word-expert leads to an overall significant improvement in the generalization accuracies of the resulting WSD systems.
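As an illustration only, and not the authors' implementation, the per-word optimization step can be sketched as a grid search under cross-validation. The sketch assumes a k-nearest-neighbour classifier as the memory-based learner, an overlap distance over symbolic context features, and two tunable settings (k and distance-weighted voting). The function names, the toy examples for the word "bank", and the parameter grid are all hypothetical, and the combination of several classifiers by voting is left out.

```python
from collections import Counter
from itertools import product


def overlap_distance(a, b):
    """Count the feature positions at which two examples disagree."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k, weighted):
    """Label a query by a vote among its k nearest stored examples."""
    neighbours = sorted(train, key=lambda ex: overlap_distance(ex[0], query))[:k]
    votes = Counter()
    for features, sense in neighbours:
        d = overlap_distance(features, query)
        # Assumed weighting scheme: closer neighbours get a larger vote.
        votes[sense] += 1.0 / (1 + d) if weighted else 1.0
    return votes.most_common(1)[0][0]


def cv_accuracy(data, k, weighted, folds):
    """Cross-validated accuracy of one parameter setting on one word's data."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [ex for i, ex in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k, weighted) == y for x, y in test)
    return correct / len(data)


def optimize_word_expert(data, ks=(1, 3, 5), weightings=(False, True), folds=3):
    """Pick the setting with the best cross-validated accuracy for one word."""
    best = max(product(ks, weightings),
               key=lambda s: cv_accuracy(data, s[0], s[1], folds))
    return {"k": best[0], "weighted": best[1],
            "accuracy": cv_accuracy(data, best[0], best[1], folds)}


# Hypothetical training material for a single word-expert ("bank"):
# each example is (context features, sense label).
bank_examples = [
    (("river", "water"), "shore"),
    (("money", "loan"), "finance"),
    (("river", "mud"), "shore"),
    (("money", "account"), "finance"),
    (("steep", "water"), "shore"),
    (("savings", "loan"), "finance"),
]

settings = optimize_word_expert(bank_examples)
print(settings)
```

Running the same search separately on each word's training material gives every word-expert its own settings, which is the sense in which optimization is done per word-expert rather than once for the whole system.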

