DOI: 10.3115/1072228.1072394

An approach based on multilingual thesauri and model combination for bilingual lexicon extraction

Published: 24 August 2002

Abstract

This paper focuses on exploiting different models and methods for bilingual lexicon extraction, from either parallel or comparable corpora, in specialized domains. First, special attention is given to the use of multilingual thesauri, and different search strategies based on such thesauri are investigated. Then, a method for combining the different models for bilingual lexicon extraction is presented. Our results show that combining the models significantly improves results, and that the hierarchical information contained in our thesaurus, UMLS/MeSH, is of primary importance. Lastly, methods for bilingual terminology extraction and thesaurus enrichment are discussed.
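
The abstract does not say how the individual models are combined. As a purely illustrative sketch, one generic way to merge candidate translations proposed by several models is a weighted linear interpolation of their normalised scores. The function name `combine_scores`, the weights, and the example scores below are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def combine_scores(model_scores, weights):
    """Weighted linear interpolation of candidate-translation scores.

    model_scores: list of dicts, one per model, mapping candidate -> score.
    weights: one non-negative weight per model (assumed, not from the paper).
    """
    if len(model_scores) != len(weights):
        raise ValueError("need exactly one weight per model")
    combined = defaultdict(float)
    for scores, weight in zip(model_scores, weights):
        total = sum(scores.values())
        if total <= 0:
            continue  # this model proposes nothing usable for the term
        for candidate, score in scores.items():
            # Normalise so each model's scores sum to 1 before weighting.
            combined[candidate] += weight * score / total
    # Best candidate first; ties are broken alphabetically.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for a single source term from two models:
# one corpus-based, one thesaurus-based.
corpus_model = {"coeur": 3.0, "cardiaque": 1.0}
thesaurus_model = {"coeur": 1.0, "myocarde": 1.0}
ranking = combine_scores([corpus_model, thesaurus_model], [0.5, 0.5])
```

A candidate supported by both models rises above candidates proposed by only one, which is the general intuition behind combining models; the paper's actual combination method may differ.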




      Published In

      COLING '02: Proceedings of the 19th international conference on Computational linguistics - Volume 1
      August 2002
      1184 pages

      Publisher

      Association for Computational Linguistics

      United States

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

      Article Metrics

      • Downloads (last 12 months): 27
      • Downloads (last 6 weeks): 2
      Reflects downloads up to 03 Jan 2025

      Cited By

      • (2017) A Generalized Constraint Approach to Bilingual Dictionary Induction for Low-Resource Language Families. ACM Transactions on Asian and Low-Resource Language Information Processing 17:2 (1-29). DOI: 10.1145/3138815. Online publication date: 13-Nov-2017
      • (2017) Corpus-Based Translation Induction in Indian Languages Using Auxiliary Language Corpora from Wikipedia. ACM Transactions on Asian and Low-Resource Language Information Processing 16:3 (1-25). DOI: 10.1145/3038295. Online publication date: 17-Mar-2017
      • (2016) Topic-based term translation models for statistical machine translation. Artificial Intelligence 232:C (54-75). DOI: 10.1016/j.artint.2015.12.002. Online publication date: 1-Mar-2016
      • (2015) Multilingual Topic Models for Bilingual Dictionary Extraction. ACM Transactions on Asian and Low-Resource Language Information Processing 14:3 (1-22). DOI: 10.1145/2699939. Online publication date: 12-Jun-2015
      • (2014) Improving Bilingual Lexicon Extraction from Comparable Corpora Using Window-Based and Syntax-Based Models. Proceedings of the 15th International Conference on Computational Linguistics and Intelligent Text Processing - Volume 8404 (310-323). DOI: 10.1007/978-3-642-54903-8_26. Online publication date: 6-Apr-2014
      • (2012) Bilingual lexicon extraction from comparable corpora using label propagation. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (24-36). DOI: 10.5555/2390948.2390952. Online publication date: 12-Jul-2012
      • (2012) Detecting highly confident word translations from comparable corpora without any prior knowledge. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (449-459). DOI: 10.5555/2380816.2380872. Online publication date: 23-Apr-2012
      • (2012) Statistical Extraction and Comparison of Pivot Words for Bilingual Lexicon Extension. ACM Transactions on Asian Language Information Processing 11:2 (1-31). DOI: 10.1145/2184436.2184439. Online publication date: 1-Jun-2012
      • (2012) QAlign. Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II (83-96). DOI: 10.1007/978-3-642-28601-8_8. Online publication date: 11-Mar-2012
      • (2011) Bilingual lexicon extraction from comparable corpora as metasearch. Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web (35-43). DOI: 10.5555/2024236.2024244. Online publication date: 24-Jun-2011