Abstract
State-of-the-art parsers are currently trained on versions of the Penn Treebank converted into dependency representations, which, however, do not include null elements. This is done to facilitate structural learning and to prevent the probabilistic engine from postulating deprecated null elements everywhere; see [19]. As a result, however, the semantics of the representation used and produced is inconsistent, which dramatically reduces its usefulness in real-life applications such as Q/A and other semantically driven fields, because it hampers the mapping of a complete logical form. What systems have come up with instead are “quasi”-logical forms, or partial logical forms, mapped directly from the surface representation in dependency structure. We show the most common problems deriving from the conversion and then describe an algorithm that we implemented and applied to our converted Italian treebank; it can be used on any CoNLL-like treebank or representation to produce an almost complete, semantically consistent dependency treebank.
References
Afonso, S., Bick, E., Haber, R., Santos, D.: Floresta sintá(c)tica: a treebank for Portuguese. In: Rodríguez, M.G., Araujo, C.P. (eds.) Proceedings of LREC 2002, pp. 1698–1703. ELRA, Spain (2002)
Attardi, G.: Experiments with a multilanguage non-projective dependency parser. In: Proceedings of the Tenth Conference on Natural Language Learning, New York (2006)
Bikel, D.M.: Intricacies of Collins’ parsing model. Comput. Linguist. 30(4), 479–511 (2004)
Black, E., Abney, S., Flickinger, D., Gdaniec, C., Grishman, R., Harrison, P., Hindle, D., Ingria, R., Jelinek, F., Klavans, J., Liberman, M., Marcus, M., Roukos, S., Santorini, B., Strzalkowski, T.: A procedure for quantitatively comparing the syntactic coverage of English grammars. In: Proceedings of the DARPA Speech and Natural Language Workshop, pp. 306–311 (1991)
Brants, T.: TnT: a statistical part-of-speech tagger. In: ANLP 2000. Seattle (2000)
Brill, E.: Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput. Linguist. 21, 543–565 (1995)
Carroll, J., Briscoe, T., Sanfilippo, A.: Parser evaluation: a survey and a new proposal. In: Proceedings of the [First] International Conference on Language Resources and Evaluation, pp. 447–454 (1998)
Collins, M.: A new statistical parser based on bigram lexical dependencies. In: Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, pp. 184–191 (1996)
Corazza, A., Lavelli, A., Satta, G., Zanoli, R.: Analyzing an Italian treebank with state-of-the-art statistical parsers. In: Proceedings of the 3rd Workshop on Treebanks and Linguistic Theories (TLT-2004), pp. 39–50. Tübingen, Germany (2004)
Delmonte, R., Bristot, A., Tonelli, S.: VIT—Venice Italian Treebank: syntactic and quantitative features. In: De Smedt, K., Hajic, J., Kübler, S. (eds.) Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. NEALT Proceedings Series, vol. 1, pp. 43–54 (2007)
Delmonte, R., Luminita, C., Ciprian, B.: Elementary trees for syntactic and statistical disambiguation. In: Proceedings of TAG+5, pp. 237–240. Paris (2000)
Delmonte, R.: From shallow parsing to functional structure. In: Atti del Workshop AI*IA— “Elaborazione del Linguaggio e Riconoscimento del Parlato”, pp. 8–19. IRST, Trento (1999)
Delmonte, R.: How to annotate linguistic information in FILES and SCAT. In: Atti del Workshop “La Treebank Sintattico-Semantica dell’Italiano di SI-TAL”, pp. 75–84. Bari (2001)
Delmonte, R.: Strutture Sintattiche dall’Analisi Computazionale di Corpora di Italiano. In: Cardinaletti, A. (ed.) Intorno all’Italiano Contemporaneo, pp. 187–220. Franco Angeli, Milano (2004)
Delmonte, R., Dolci, R.: Parsing Italian with a context-free recognizer. Annali di Ca’ Foscari XXVIII(1–2), 123–161 (1989)
Delmonte, R.: Shallow parsing and functional structure in Italian corpora. In: Proceedings of LREC 2000, pp. 113–119. Athens (2000)
Delmonte, R.: Treebanking in VIT: from phrase structure to dependency representation. In: Nirenburg, S. (ed.) Language Engineering for Lesser-Studied Languages, pp. 51–80. IOS Press, The Netherlands (2009)
Delmonte, R.: Computational Linguistic Text Processing—Lexicon Grammar Parsing and Anaphora Resolution. Nova Science Publishers, New York (2009)
Gaizauskas, R.: Investigations into the grammar underlying the Penn treebank II. Technical Report CS-95-25, Department of Computer Science, University of Sheffield (1995)
Harper, M.P., Helzerman, R.A.: Extensions to constraint dependency parsing for spoken language processing. Comput. Speech Lang. 9, 187–234 (1995)
Hellwig, P.: Dependency unification grammar. In: Proceedings COLING-86, pp. 195–198 (1986)
Hudson, R.: Word Grammar. Blackwell, London (1984)
Hudson, R.: English Word Grammar. Blackwell, London (1990)
Jackendoff, R.: X-Bar Syntax. The MIT Press, Cambridge (1977)
Jaervinen, T., Tapanainen, P.: Towards an implementable dependency grammar. In: Kahane, S., Polguère, A. (eds.) Proceedings of the Workshop on Processing of Dependency-Based Grammars, pp. 1–10 (1998)
Lesmo, L., Lombardo, V., Bosco, C.: Treebank development: the TUT approach. In: Proceedings of ICON 2002. Mumbai (2002)
Marcus, M., et al.: Building a large annotated corpus of English: the Penn Treebank. Comput. Linguist. 19(2), 313–330 (1993)
Martí, M.A., Taulé, M., Márquez, L., Bertran, M.: AnCora: a multilingual and multilevel annotated corpus. http://clic.ub.edu/ancora/publications/ (2007)
Maruyama, H.: Structural disambiguation with constraint propagation. In: Proceedings of the 28th Meeting of the Association for Computational Linguistics (ACL), pp. 31–38. Pittsburgh (1990)
Mel’cuk, I.: Dependency Syntax: Theory and Practice. State University of New York Press, New York (1988)
Menzel, W., Schroeder, I.: Decision procedures for dependency parsing using graded constraints. In: Kahane, S., Polguère, A. (eds.) Proceedings of the Workshop on Processing of Dependency-Based Grammars, pp. 78–87 (1998)
Montemagni, S., et al.: The Italian Syntactic-Semantic Treebank: architecture, annotation, tools and evaluation. In: Proceedings of LINC, ACL, pp. 18–27. Luxembourg (2000)
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Delmonte, R. (2015). Dependency Treebank Annotation and Null Elements: An Experiment with VIT. In: Basili, R., Bosco, C., Delmonte, R., Moschitti, A., Simi, M. (eds) Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project. Studies in Computational Intelligence, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-319-14206-7_2
DOI: https://doi.org/10.1007/978-3-319-14206-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14205-0
Online ISBN: 978-3-319-14206-7
eBook Packages: Engineering, Engineering (R0)