Reordering model using syntactic information of a source tree for statistical machine translation

Published: 05 June 2009

Abstract

This paper presents a reordering model that uses syntactic information from a source tree for phrase-based statistical machine translation. The proposed model extends the IST-ITG (imposing source tree on inversion transduction grammar) constraints. In the proposed method, the target-side word order is obtained by rotating nodes of the source-side parse tree. We model each node rotation, monotone or swap, using word alignments from a training parallel corpus together with source-side parse trees. The model efficiently suppresses erroneous target word orderings, especially global ones. Furthermore, the proposed method evaluates target word reorderings probabilistically. In English-to-Japanese and English-to-Chinese translation experiments, the proposed method improved word BLEU-4 over the IST-ITG constraints by 0.49 points (29.31 to 29.80) and 0.33 points (18.60 to 18.93), respectively. These results indicate the validity of the proposed reordering model.
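
The node-rotation idea can be illustrated with a small sketch. It is not the authors' implementation: it assumes a binary source tree, an independent swap probability per node label, and made-up probability values (the abstract says the rotation model is estimated from word alignments and source-side parse trees, but gives no further detail). The names `Node`, `rotations`, and `p_swap` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Node:
    label: str      # syntactic label of a source-tree node, e.g. "VP"
    left: "Tree"
    right: "Tree"


Tree = Union[str, Node]  # a leaf is a single source word


def rotations(tree: Tree, p_swap: Dict[str, float]) -> List[Tuple[List[str], float]]:
    """Enumerate every word order reachable by rotating tree nodes.

    Each node is either kept (monotone) or has its children exchanged
    (swap). Returns (word order, log-probability) pairs, assuming each
    node's decision is independent of the others.
    """
    if isinstance(tree, str):
        return [([tree], 0.0)]
    ps = p_swap.get(tree.label, 0.5)  # assumed fallback for unseen labels
    results = []
    for left_words, left_lp in rotations(tree.left, p_swap):
        for right_words, right_lp in rotations(tree.right, p_swap):
            base = left_lp + right_lp
            # monotone: children stay in source order
            results.append((left_words + right_words, base + math.log(1.0 - ps)))
            # swap: children are exchanged
            results.append((right_words + left_words, base + math.log(ps)))
    return results


# Toy source tree for "he ate sushi": (S he (VP ate sushi))
tree = Node("S", "he", Node("VP", "ate", "sushi"))
# Hypothetical swap probabilities, not values from the paper.
p_swap = {"S": 0.1, "VP": 0.9}

orders = rotations(tree, p_swap)
best_words, best_logp = max(orders, key=lambda pair: pair[1])
print(" ".join(best_words))  # → he sushi ate
```

With two internal nodes only four orders are reachable, so an order such as "ate he sushi", which would break up the VP constituent, is never generated. This is the sense in which tree-based rotation rules out some global reorderings, while the per-node probabilities rank the orders that remain.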

Published In

SSST '09: Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
June 2009
99 pages
ISBN: 9781932432398
Program Chairs: Dekai Wu, David Chiang

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Research-article
