Improve syntax-based translation using deep syntactic structures

Authors:

Takuya Matsuzaki,

Jun'ichi Tsujii

Machine Translation, Volume 24, Issue 2

Pages 141 - 157

https://doi.org/10.1007/s10590-010-9081-6

Published: 01 June 2010

Abstract

This paper introduces deep syntactic structures to syntax-based Statistical Machine Translation (SMT). We use a Head-driven Phrase Structure Grammar (HPSG) parser to obtain the deep syntactic structures of a sentence, which include not only a fine-grained description of syntactic properties but also a semantic representation. Given the rich information these structures carry, we investigate whether they improve on traditional syntax-based translation models built on PCFG parsers. To use deep syntactic structures for SMT, this paper focuses on extracting tree-to-string translation rules from aligned HPSG tree-string pairs. The major challenge is to properly localize the non-local relations among nodes in an HPSG tree. To localize the semantic dependencies among words and phrases, which can be inherently non-local, a minimum covering tree is defined by taking a predicate word and its lexical/phrasal arguments as the frontier nodes. Starting from this definition, a linear-time algorithm is proposed that extracts translation rules in a single traversal of the leaf nodes of an HPSG tree. Extensive experiments on a tree-to-string translation system demonstrate the effectiveness of our proposal.
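To make the notion of a minimum covering tree more concrete, the following is a minimal sketch under simplifying assumptions. It works on a plain constituency tree rather than a real HPSG parse, it finds the covering root with a naive lowest-common-ancestor search rather than the linear-time leaf traversal the abstract mentions, and every name in it (`Node`, `minimum_covering_tree`, `frontier`) is hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A tree node with a parent pointer (toy stand-in for an HPSG node)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def ancestors(node: Optional[Node]) -> List[Node]:
    """Path from `node` up to the root, starting with `node` itself."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def lowest_common_ancestor(nodes: List[Node]) -> Node:
    """Deepest node that dominates every node in `nodes`."""
    common = ancestors(nodes[0])
    for other in nodes[1:]:
        seen = {id(n) for n in ancestors(other)}
        common = [n for n in common if id(n) in seen]
    return common[0]


def minimum_covering_tree(predicate: Node, arguments: List[Node]):
    """Return (root in the original tree, copy cut at predicate/arguments)."""
    frontier_ids = {id(predicate)} | {id(a) for a in arguments}
    root = lowest_common_ancestor([predicate, *arguments])

    def copy(node: Node) -> Node:
        fresh = Node(node.label)
        # Assumption: expansion stops at the predicate and argument nodes,
        # so they become the frontier of the copied tree.
        if id(node) not in frontier_ids:
            fresh.add(*(copy(child) for child in node.children))
        return fresh

    return root, copy(root)


def frontier(node: Node) -> List[str]:
    """Left-to-right labels of the leaves of a tree."""
    if not node.children:
        return [node.label]
    return [label for child in node.children for label in frontier(child)]


# Toy sentence "John loves Mary": predicate word plus two phrasal arguments.
john_np = Node("NP").add(Node("John"))
mary_np = Node("NP").add(Node("Mary"))
loves = Node("loves")
tree = Node("S").add(john_np, Node("VP").add(Node("V").add(loves), mary_np))

root, cut = minimum_covering_tree(loves, [john_np, mary_np])
print(root.label, frontier(cut))
```

In this toy case the covering tree is rooted at `S` and its frontier consists of the two argument phrases and the predicate word, which is the shape of the source side of a tree-to-string rule. How the paper handles material that is neither predicate nor argument, and how it reaches linear time, is not covered by this sketch.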

Published In

Machine Translation Volume 24, Issue 2

June 2010

105 pages

ISSN:0922-6567

Copyright © 2010 Springer Science+Business Media B.V.

Publisher

Kluwer Academic Publishers

United States
