Modeling Monolingual Character Alignment for Automatic Evaluation of Chinese Translation

Published: 28 January 2016

Abstract

Automatic evaluation of machine translations is an important task. Most existing evaluation metrics rely on matching the same word or letter n-grams. This strategy leads to poor results on Chinese translations because one has to rely merely on matching identical characters. In this article, we propose a new evaluation metric that allows different characters with the same or similar meaning to match. An Indirect Hidden Markov Model (IHMM) is proposed to align the Chinese translation with human references at the character level. In the model, the emission probabilities are estimated by character similarity, including character semantic similarity and character surface similarity, and transition probabilities are estimated by a heuristic distance-based distortion model. When evaluating the submitted output of English-to-Chinese translation systems in the IWSLT’08 CT-EC and NIST’08 EC tasks, the experimental results indicate that the proposed metric has a significantly better correlation with human evaluation than the state-of-the-art machine translation metrics (i.e., BLEU, Meteor Universal, and TESLA-CELAB). This study shows that it is important to allow different characters to match in the evaluation of Chinese translations and that the IHMM is a reasonable approach for the alignment of Chinese characters.
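
The alignment idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation: the toy similarity table, the smoothing floor, the distortion weight `(1 + |d - 1|) ** -k`, the constant `k`, and the use of a virtual start position before the first reference character are all assumptions made for illustration. Each reference character is treated as a hidden state, emission scores come from character similarity, transition scores come from a distance-based distortion weight, and Viterbi decoding picks the best alignment.

```python
import math

# Hypothetical toy similarity table (an assumption for illustration only);
# the article's semantic and surface similarity measures are not reproduced.
TOY_SIMILAR = {frozenset(("看", "视")): 0.6}


def similarity(a, b, floor=0.01):
    """Toy character similarity in (0, 1]; `floor` smooths unrelated pairs."""
    if a == b:
        return 1.0
    return TOY_SIMILAR.get(frozenset((a, b)), floor)


def transition(prev, cur, n, k=2.0):
    """Distance-based distortion: jumps to the next position are favoured.

    The weight (1 + |d - 1|) ** -k with d = cur - prev is an assumed form,
    normalised over all n reference positions.
    """
    weights = [(1 + abs(j - prev - 1)) ** -k for j in range(n)]
    return weights[cur] / sum(weights)


def align(hyp, ref, k=2.0):
    """Return, for each hypothesis character, the aligned reference index."""
    n = len(ref)
    if not hyp or not ref:
        return []
    # Start from a virtual position -1 so that reference index 0 is favoured.
    score = [
        math.log(transition(-1, i, n, k)) + math.log(similarity(ref[i], hyp[0]))
        for i in range(n)
    ]
    backpointers = []
    for ch in hyp[1:]:
        new_score, ptr = [], []
        for i in range(n):
            # Best previous state for landing on reference position i.
            best = max(
                range(n),
                key=lambda p: score[p] + math.log(transition(p, i, n, k)),
            )
            new_score.append(
                score[best]
                + math.log(transition(best, i, n, k))
                + math.log(similarity(ref[i], ch))
            )
            ptr.append(best)
        score = new_score
        backpointers.append(ptr)
    # Trace the best path backwards from the highest-scoring final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With the toy table above, `align("我视书", "我看书")` returns `[0, 1, 2]`: the non-identical pair 视/看 is still aligned because its similarity score is well above the smoothing floor, which is the behaviour that exact character matching cannot provide.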

Cited By

  • (2019) The automatic evaluation model of physical education teaching based on two screening algorithms. Journal of Intelligent & Fuzzy Systems, 1-9. DOI: 10.3233/JIFS-179176. Online publication date: 6-Jun-2019
  • (2018) Optimizing Automatic Evaluation of Machine Translation with the ListMLE Approach. ACM Transactions on Asian and Low-Resource Language Information Processing 18:1, 1-18. DOI: 10.1145/3226045. Online publication date: 12-Nov-2018

Information

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 15, Issue 3
March 2016
220 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/2876004
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 January 2016
Accepted: 01 August 2015
Revised: 01 June 2015
Received: 01 December 2014
Published in TALLIP Volume 15, Issue 3

Author Tags

  1. Automatic evaluation
  2. Chinese character
  3. Chinese translation
  4. IHMM
  5. monolingual character alignment
  6. segment-level consistency
  7. synonym matching
  8. system-level correlation
  9. word order

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Natural Science Foundation of China
  • Natural Science Foundation of Jiangxi Provincial Department of Science and Technology of China

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 15 Jan 2025.
