Abstract
In this paper, we propose an optimized method for segmenting Uyghur words. We treat the optimization as a classification problem, with features extracted from a Uyghur-Chinese bilingual corpus. Experimental results show that our method significantly improves the performance of Uyghur-Chinese machine translation.
Acknowledgements
This work is supported by the National High Technology Research and Development Program of China (No. 2013AA01A607), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA06030400), the West Light Foundation of the Chinese Academy of Sciences (No. XBBS201216), and the Key Project of the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KGZD-EW-501).
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Mi, C. et al. (2015). Optimized Uyghur Segmentation for Statistical Machine Translation. In: Biemann, C., Handschuh, S., Freitas, A., Meziane, F., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2015. Lecture Notes in Computer Science, vol 9103. Springer, Cham. https://doi.org/10.1007/978-3-319-19581-0_36
DOI: https://doi.org/10.1007/978-3-319-19581-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19580-3
Online ISBN: 978-3-319-19581-0
eBook Packages: Computer Science, Computer Science (R0)