Abstract
We present a statistical generative model for unsupervised learning of verb argument structures. The model was used to automatically induce the argument structures for the 1,500 most frequent verbs of English. In an evaluation carried out for a representative sample of verbs, more than 90% of the induced argument structures were judged correct by human subjects. The induced structures also overlap significantly with those in PropBank, exhibiting some correct patterns of usage that are not present in this manually developed semantic resource.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pardo, T.A.S., Marcu, D., Nunes, M.d.G.V. (2006). Unsupervised Learning of Verb Argument Structures. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_7
DOI: https://doi.org/10.1007/11671299_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32205-4
Online ISBN: 978-3-540-32206-1
eBook Packages: Computer Science, Computer Science (R0)