DOI: 10.3115/992628.992708

Alignment of shared forests for bilingual corpora

Published: 05 August 1996

Abstract

Research in example-based machine translation (EBMT) has been hampered by the lack of efficient tree alignment algorithms for bilingual corpora. This paper describes an alignment algorithm for EBMT whose running time is quadratic in the size of the input parse trees. The algorithm uses dynamic programming to score all possible matching nodes between structure-sharing trees or forests. We describe the algorithm, various optimizations, and our implementation.
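
The abstract gives no details of the scoring function, so the following is only a minimal sketch of the general idea of dynamic programming over node pairs, not the authors' algorithm. It assumes ordered trees, an order-preserving match between child sequences, and a hypothetical `match` function that compares node labels (in a bilingual setting this would more plausibly consult a translation lexicon). The names `node`, `align_score`, and `match` are all illustrative. Scores are memoized by object identity, so a subtree shared between several parents is scored only once per partner node.

```python
def node(label, *children):
    """Build a tree node as a (label, children) tuple; leaves have no children."""
    return (label, tuple(children))


def align_score(root1, root2, match=lambda a, b: 1 if a == b else 0):
    """Score the best alignment that pairs root1 with root2.

    Assumption: children are matched in order (LCS-style). This is a
    simplification chosen for the sketch, not something the source states.
    """
    memo = {}  # (id(u), id(v)) -> score; shared subtrees hit this cache

    def score(u, v):
        key = (id(u), id(v))
        if key in memo:
            return memo[key]
        (label_u, kids_u), (label_v, kids_v) = u, v
        # best[i][j]: best total for the first i children of u
        # against the first j children of v.
        best = [[0] * (len(kids_v) + 1) for _ in range(len(kids_u) + 1)]
        for i in range(1, len(kids_u) + 1):
            for j in range(1, len(kids_v) + 1):
                best[i][j] = max(
                    best[i - 1][j],      # leave child i of u unmatched
                    best[i][j - 1],      # leave child j of v unmatched
                    best[i - 1][j - 1] + score(kids_u[i - 1], kids_v[j - 1]),
                )
        memo[key] = match(label_u, label_v) + best[-1][-1]
        return memo[key]

    return score(root1, root2)


# Two small hypothetical parse trees that differ in one leaf.
t1 = node("S", node("NP", node("dog")), node("VP", node("runs")))
t2 = node("S", node("NP", node("dog")), node("VP", node("barks")))
print(align_score(t1, t2))
```

In this sketch each pair of nodes is scored at most once, and the child-table work for a pair is the product of the two child counts, so the total work is bounded by the product of the two tree sizes. That is consistent with the quadratic bound stated in the abstract, though the paper's own formulation may differ.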


Cited By

  • (2011) Improving MT word alignment using aligned multi-stage parses. Proceedings of the Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation, pages 88-97. DOI: 10.5555/2024261.2024271. Online publication date: 23-Jun-2011
  • (2007) Dependency-based paraphrasing for recognizing textual entailment. Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 83-88. DOI: 10.5555/1654536.1654554. Online publication date: 28-Jun-2007
  • (2005) Rooted Maximum Agreement Supertrees. Algorithmica 43:4, pages 293-307. DOI: 10.5555/3118737.3118847. Online publication date: 1-Dec-2005


Published In

COLING '96: Proceedings of the 16th conference on Computational linguistics - Volume 1
August 1996
600 pages
  • Program Chair: J. Tsujii

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 31
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 20 Jan 2025

Citations

Cited By

  • (2005) Classification of semantic relations by humans and machines. Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, pages 1-6. DOI: 10.5555/1631862.1631863. Online publication date: 30-Jun-2005
  • (2005) Sentence fusion for multidocument news summarization. Computational Linguistics 31:3, pages 297-328. DOI: 10.5555/1108994.2490936. Online publication date: 1-Sep-2005
  • (2004) Statistical machine translation by parsing. Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, pages 653-es. DOI: 10.3115/1218955.1219038. Online publication date: 21-Jul-2004
  • (2003) Text simplification for reading assistance. Proceedings of the second international workshop on Paraphrasing - Volume 16, pages 9-16. DOI: 10.3115/1118984.1118986. Online publication date: 11-Jul-2003
  • (2002) A synchronization structure of SSTC and its applications in machine translation. Proceedings of the 2002 COLING workshop on Machine translation in Asia - Volume 16, pages 1-8. DOI: 10.3115/1118794.1118795. Online publication date: 1-Sep-2002
  • (2002) Corpus-based generation of numeral classifier using phrase alignment. Proceedings of the 19th international conference on Computational linguistics - Volume 1, pages 1-7. DOI: 10.3115/1072228.1072245. Online publication date: 24-Aug-2002
  • (2002) Structure alignment using bilingual chunking. Proceedings of the 19th international conference on Computational linguistics - Volume 1, pages 1-7. DOI: 10.3115/1072228.1072238. Online publication date: 24-Aug-2002