Abstract
In this paper we present the first manually built Chinese sentence compression corpus. Based on this corpus, we develop a Chinese sentence compression system and study various measures for evaluating Chinese sentence compression. We find that 1) using multiple references is very helpful for automatic evaluation of Chinese sentence compression; and 2) besides relational F1, some machine translation evaluation measures correlate well with human judgments and are thus very promising for future use in this task.
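To illustrate what multi-reference evaluation can look like, the following is a minimal sketch and not the evaluation procedure used in the paper. It swaps in a simple token-overlap F1 in place of relational F1 (which would require a dependency parser) and assumes that a system compression is scored against its best-matching human reference. The function names `token_f1` and `multi_reference_f1` and the toy tokens are hypothetical.

```python
from collections import Counter


def token_f1(candidate, reference):
    """Token-overlap F1 between two token lists (multiset intersection).

    Assumption: this is a simple stand-in measure, not relational F1.
    """
    if not candidate or not reference:
        return 0.0
    # Count tokens shared by both lists, respecting repeated tokens.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def multi_reference_f1(candidate, references):
    """Score a compression against several references, keeping the best match.

    Assumption: taking the maximum is one common way to combine references;
    the paper may combine them differently.
    """
    return max((token_f1(candidate, ref) for ref in references), default=0.0)


if __name__ == "__main__":
    # Hypothetical pre-segmented tokens for one system output and two references.
    system_output = ["a", "b", "c"]
    human_references = [["a", "b"], ["a", "b", "c"]]
    print(multi_reference_f1(system_output, human_references))
```

With a single reference, a valid compression that differs from that one annotator's choice is penalised; scoring against several references reduces this effect, which is one intuition behind the first finding above.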
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, C., Hu, M., Xiao, T., Jiang, X., Shi, L., Zhu, J. (2013). Chinese Sentence Compression: Corpus and Evaluation. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2013, CCL 2013. Lecture Notes in Computer Science, vol 8202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41491-6_24
DOI: https://doi.org/10.1007/978-3-642-41491-6_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41490-9
Online ISBN: 978-3-642-41491-6
eBook Packages: Computer Science, Computer Science (R0)