DOI: 10.3115/981967.981984

Inside-outside reestimation from partially bracketed corpora

Published: 28 June 1992

Abstract

The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modeling of hierarchical structure than the original one. In particular, over 90% test set bracketing accuracy was achieved for grammars inferred by our algorithm from a training set of handparsed part-of-speech strings for sentences in the Air Travel Information System spoken language corpus. Finally, the new algorithm has better time complexity than the original one when sufficient bracketing is provided.
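
The core idea of using bracketing during estimation can be illustrated with a minimal sketch of the inside pass. It assumes a grammar in Chomsky normal form and treats a span as admissible only if it does not cross any bracket supplied with the sentence; the function names, the dictionary-based grammar encoding, and the half-open span convention are illustrative assumptions, not the paper's notation or implementation. Only inside probabilities are shown; the outside pass and the reestimation step are omitted.

```python
from collections import defaultdict


def crosses(span, bracket):
    """True if two half-open spans overlap without one containing the other."""
    (i, j), (a, b) = span, bracket
    return (i < a < j < b) or (a < i < b < j)


def inside_probs(words, unary, binary, brackets=()):
    """Inside probabilities for a CNF grammar, restricted by a partial bracketing.

    unary:    {(A, word): prob} for rules A -> word        (assumed encoding)
    binary:   {(A, B, C): prob} for rules A -> B C         (assumed encoding)
    brackets: half-open (start, end) spans over word positions
    Returns a chart {(i, j): {A: inside probability}}.
    """
    n = len(words)
    brackets = list(brackets)

    def compatible(i, j):
        return not any(crosses((i, j), b) for b in brackets)

    chart = defaultdict(dict)

    # Width-1 spans can never cross a bracket, so no check is needed here.
    for i, w in enumerate(words):
        for (A, word), p in unary.items():
            if word == w:
                chart[i, i + 1][A] = chart[i, i + 1].get(A, 0.0) + p

    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            if not compatible(i, j):
                continue  # spans that cross a bracket are never filled
            for k in range(i + 1, j):
                left, right = chart.get((i, k)), chart.get((k, j))
                if not left or not right:
                    continue
                for (A, B, C), p in binary.items():
                    if B in left and C in right:
                        cell = chart[i, j]
                        cell[A] = cell.get(A, 0.0) + p * left[B] * right[C]
    return chart


if __name__ == "__main__":
    # Toy grammar (hypothetical): S -> S S (0.5), S -> a (0.5)
    unary = {("S", "a"): 0.5}
    binary = {("S", "S", "S"): 0.5}
    sentence = ["a", "a", "a"]

    free = inside_probs(sentence, unary, binary)
    constrained = inside_probs(sentence, unary, binary, brackets=[(0, 2)])
    print(free[0, 3]["S"], constrained[0, 3]["S"])
```

In this toy case the bracket over the first two words rules out the right-branching analysis, so the constrained chart sums over one derivation instead of two and never fills the crossing span (1, 3). Skipping such spans is also where a saving in work can come from as more brackets are supplied.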

Published In

ACL '92: Proceedings of the 30th annual meeting on Association for Computational Linguistics
June 1992
346 pages

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
