A novel discriminative framework for sentence-level discourse analysis

Authors:

Giuseppe Carenini,

Raymond T. Ng

EMNLP-CoNLL '12: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Pages 904 - 915

Published: 12 July 2012

Abstract

We propose a complete probabilistic discriminative framework for performing sentence-level discourse analysis. Our framework comprises a discourse segmenter, based on a binary classifier, and a discourse parser, which applies an optimal CKY-like parsing algorithm to probabilities inferred from a Dynamic Conditional Random Field. We show on two corpora that our approach outperforms the state-of-the-art, often by a wide margin.
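
To make the "CKY-like parsing over inferred probabilities" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical function `split_prob(i, k, j)` that returns the probability of joining the adjacent spans `[i, k]` and `[k+1, j]` into one constituent; in the paper these probabilities come from a Dynamic Conditional Random Field, which is replaced here by a plain lookup. Relation labels and nuclearity are omitted for brevity.

```python
import math
from typing import Callable, Dict, Optional, Tuple, Union

# A tree is either a unit index (leaf) or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def cky_best_tree(
    n: int, split_prob: Callable[[int, int, int], float]
) -> Tuple[float, Optional[Tree]]:
    """Return (probability, tree) for the most probable binary tree over units 0..n-1.

    split_prob(i, k, j) is a hypothetical stand-in for a model score: the
    probability that span [i, j] is built by joining [i, k] and [k+1, j].
    """
    if n < 1:
        raise ValueError("need at least one unit")

    # best[(i, j)] holds (log-probability, tree) for the best tree over span [i, j].
    best: Dict[Tuple[int, int], Tuple[float, Optional[Tree]]] = {}
    for i in range(n):
        best[(i, i)] = (0.0, i)  # a single unit has log-probability log(1) = 0

    # Fill the chart bottom-up, from short spans to the full sequence.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length - 1
            cand: Tuple[float, Optional[Tree]] = (-math.inf, None)
            for k in range(i, j):
                p = split_prob(i, k, j)
                if p <= 0.0:
                    continue  # impossible split under the model
                left_score, left_tree = best[(i, k)]
                right_score, right_tree = best[(k + 1, j)]
                score = math.log(p) + left_score + right_score
                if score > cand[0]:
                    cand = (score, (left_tree, right_tree))
            best[(i, j)] = cand

    log_p, tree = best[(0, n - 1)]
    return math.exp(log_p), tree


# Toy usage with made-up probabilities for three units.
toy = {(0, 0, 1): 0.9, (1, 1, 2): 0.2, (0, 0, 2): 0.5, (0, 1, 2): 0.6}
prob, tree = cky_best_tree(3, lambda i, k, j: toy.get((i, k, j), 0.0))
```

Because every span is filled from its best sub-spans, the returned tree is optimal for the given probabilities, at a cost cubic in the number of units.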

Cited By

Li J, Song Y, Wei Z, Wong K (2018). A joint model of conversational discourse and latent topics on microblogs. Computational Linguistics 44:4 (719-754). Online publication date: 1-Dec-2018.
https://dl.acm.org/doi/10.1162/coli_a_00335

Published In

EMNLP-CoNLL '12: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

July 2012

1573 pages

General Chair: Jun'ichi Tsujii (Microsoft Research Asia)

Program Chairs: James Henderson (Xerox Research Centre Europe), Marius Pasca (Google)

Publisher

Association for Computational Linguistics

United States

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 73 of 234 submissions, 31%
