Abstract
This paper proposes a unified extractive approach, based on an affinity graph, to both generic and topic-focused multi-document summarization. For generic summarization, an asymmetric similarity measure is used so that the relationships between sentences are reflected in a directed affinity graph. For topic-focused summarization, the topic information is incorporated into the affinity graph through a topic-sensitive affinity measure. Based on the affinity graph, the information richness of each sentence is computed by a graph-ranking algorithm that differentiates intra-document links from inter-document links between sentences. Finally, a greedy algorithm imposes a diversity penalty on the sentences, and the sentences with both high information richness and high information novelty are selected for the summary. Experimental results on the DUC 2002-2005 tasks demonstrate the strong performance of the proposed approaches on both generic and topic-focused multi-document summarization.
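The generic pipeline described in the abstract (directed affinity graph, graph ranking for information richness, greedy selection with a diversity penalty) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the affinity normalized by the source sentence's norm, the link weights `intra_w` and `inter_w`, the PageRank-style update with `damping`, the `penalty` factor, and all function names. It is not the authors' implementation, and it omits the topic-sensitive affinity measure.

```python
import math
from collections import Counter


def term_vector(sentence):
    """Bag-of-words term counts (assumption: whitespace tokens, lowercased)."""
    return Counter(sentence.lower().split())


def affinity(vi, vj):
    """Assumed asymmetric affinity from i to j: the dot product is
    normalized by the norm of i only, so affinity(i, j) != affinity(j, i)
    in general."""
    norm_i = math.sqrt(sum(c * c for c in vi.values()))
    if norm_i == 0:
        return 0.0
    return sum(c * vj.get(t, 0) for t, c in vi.items()) / norm_i


def build_graph(sentences, doc_ids, intra_w=0.3, inter_w=1.0):
    """Row-normalized directed affinity matrix. Links inside one document
    and links across documents get different (hypothetical) weights."""
    vecs = [term_vector(s) for s in sentences]
    n = len(sentences)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w = intra_w if doc_ids[i] == doc_ids[j] else inter_w
                m[i][j] = w * affinity(vecs[i], vecs[j])
        row_sum = sum(m[i])
        if row_sum > 0:
            m[i] = [x / row_sum for x in m[i]]
    return m


def information_richness(m, damping=0.85, iters=100):
    """PageRank-style power iteration over the affinity matrix
    (an assumed stand-in for the graph-ranking step)."""
    n = len(m)
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [
            (1 - damping) / n
            + damping * sum(rank[i] * m[i][j] for i in range(n))
            for j in range(n)
        ]
    return rank


def greedy_select(m, rank, k, penalty=1.0):
    """Repeatedly pick the top-scoring sentence, then lower the score of
    every remaining sentence in proportion to its affinity to the pick."""
    scores = list(rank)
    remaining = list(range(len(rank)))
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda j: scores[j])
        chosen.append(best)
        remaining.remove(best)
        for j in remaining:
            scores[j] -= penalty * m[j][best] * rank[best]
    return chosen
```

Calling `build_graph` on a list of sentences and their document ids, then `information_richness` and `greedy_select`, returns the indices of the selected sentences. With two identical sentences from different documents and one unrelated sentence, the penalty pushes the duplicate below the unrelated sentence, which is the novelty effect the abstract describes.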
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wan, X., Xiao, J. (2007). Towards a Unified Approach Based on Affinity Graph to Various Multi-document Summarizations. In: Kovács, L., Fuhr, N., Meghini, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2007. Lecture Notes in Computer Science, vol 4675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74851-9_25
DOI: https://doi.org/10.1007/978-3-540-74851-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74850-2
Online ISBN: 978-3-540-74851-9
eBook Packages: Computer Science, Computer Science (R0)