Abstract
We describe DériF, a rule-based morphosemantic analyzer developed for French. Unlike existing word segmentation tools, DériF annotates derived and compound words with three kinds of semantic information: (1) a definition, computed from the meaning of the base and the specific contribution of the morphological rule; (2) lexical-semantic features, inferred from general linguistic properties of derivation rules; (3) lexical relations (synonymy, (co-)hyponymy) with other, morphologically unrelated words in the same analyzed corpus.
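To make the first two kinds of annotation concrete, here is a minimal sketch in Python of how a rule-based analyzer might attach a definition and features to a derived word. It is a toy under stated assumptions, not DériF's implementation: the `Rule` and `Analysis` types, the two suffix rules, the English gloss templates, the feature labels, and the bracketed structure string are all hypothetical and chosen only for illustration. Lexical relations (the third kind) are not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """A toy derivation rule (hypothetical; not taken from DériF)."""
    suffix: str        # suffix carried by the derived word
    base_ending: str   # ending restored to rebuild the base
    base_pos: str      # part of speech of the base
    derived_pos: str   # part of speech of the derived word
    gloss: str         # definition template built on the base
    features: tuple    # features contributed by the rule itself


@dataclass(frozen=True)
class Analysis:
    word: str
    base: str
    structure: str
    definition: str
    features: tuple


# Two illustrative rules for French suffixes. Glosses and feature
# labels are assumptions made for this sketch.
RULES = (
    Rule("able", "er", "V", "ADJ",
         "that can be subjected to the action '{base}'", ("potential",)),
    Rule("eur", "er", "V", "N",
         "one who performs the action '{base}'", ("agent",)),
)


def analyze(word: str, lexicon: set) -> Optional[Analysis]:
    """Return an analysis if a rule applies and the rebuilt base is attested."""
    for rule in RULES:
        if not word.endswith(rule.suffix):
            continue
        base = word[: -len(rule.suffix)] + rule.base_ending
        # Require an attested base to avoid spurious analyses such as 'table'.
        if base not in lexicon:
            continue
        return Analysis(
            word=word,
            base=base,
            structure=f"[[{base} {rule.base_pos}] {rule.suffix} {rule.derived_pos}]",
            definition=rule.gloss.format(base=base),
            features=rule.features,
        )
    return None


if __name__ == "__main__":
    lexicon = {"laver", "danser"}
    for w in ("lavable", "danseur", "table"):
        print(w, "->", analyze(w, lexicon))
```

The definition is built from the base plus a rule-specific template, and the features come from the rule alone, which mirrors the division of labor described in points (1) and (2). Checking the rebuilt base against a lexicon is one simple way to reject words that merely end in a suffix-like string.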
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Namer, F. (2013). A Rule-Based Morphosemantic Analyzer for French for a Fine-Grained Semantic Annotation of Texts. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40486-3_6
DOI: https://doi.org/10.1007/978-3-642-40486-3_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40485-6
Online ISBN: 978-3-642-40486-3
eBook Packages: Computer Science, Computer Science (R0)