Abstract
Proper treatment of collocations constitutes a serious challenge for NLP systems in general. This paper describes how Fips, a “Principles and Parameters” grammar-based parser developed at LATL, handles multi-word expressions. To obtain more precise and more reliable collocation data, the Fips parser is used to extract collocations from large text corpora. It will be shown that collocational information can help rank the alternative analyses computed by the parser and thereby improve the quality of its results.
Thanks to Paola Merlo, Luka Nerima, Juri Mengon, and Stephanie Durrleman for comments on an earlier version of this paper. This research was supported in part by a grant from the Swiss Commission for Technology and Innovation (CTI).
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wehrli, E. (2000). Parsing and Collocations. In: Christodoulakis, D.N. (ed.) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_26
DOI: https://doi.org/10.1007/3-540-45154-4_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67605-8
Online ISBN: 978-3-540-45154-9
eBook Packages: Springer Book Archive