Towards a Unified Approach Based on Affinity Graph to Various Multi-document Summarizations

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4675)

Abstract

This paper proposes a unified extractive approach, based on an affinity graph, to both generic and topic-focused multi-document summarization. For generic summarization, an asymmetric similarity measure is used to capture the relationships between sentences in a directed affinity graph. For topic-focused summarization, the topic information is incorporated into the affinity graph through a topic-sensitive affinity measure. Based on the affinity graph, the information richness of each sentence is computed by a graph-ranking algorithm that differentiates intra-document links from inter-document links between sentences. Finally, a greedy algorithm imposes a diversity penalty on sentences, and the sentences with both high information richness and high information novelty are selected for the summary. Experimental results on the DUC 2002-2005 tasks demonstrate the excellent performance of the proposed approach on both generic and topic-focused multi-document summarization.
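
The generic pipeline outlined in the abstract (directed affinity graph, graph ranking, greedy diversity penalty) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the bag-of-words overlap normalized by the source sentence, the `intra_weight` and `inter_weight` link weights, the PageRank-style damping, and the form of the penalty update. All function names are hypothetical, and the topic-sensitive affinity measure is not covered.

```python
from collections import Counter


def affinity_matrix(sentences, doc_ids, intra_weight=0.5, inter_weight=1.0):
    """Directed affinity: aff[i][j] = weight * dot(i, j) / |i|^2.

    Dividing by the norm of sentence i only (an assumed measure) makes the
    value asymmetric. The two weights (also assumed) differentiate links
    inside one document from links across documents.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        norm_i = sum(c * c for c in bags[i].values())
        for j in range(n):
            if i == j or norm_i == 0:
                continue  # no self-links; empty sentences have no out-links
            dot = sum(c * bags[j][w] for w, c in bags[i].items())
            weight = intra_weight if doc_ids[i] == doc_ids[j] else inter_weight
            aff[i][j] = weight * dot / norm_i
    return aff


def information_richness(aff, damping=0.85, iterations=100):
    """PageRank-style power iteration over the row-normalized affinity graph."""
    n = len(aff)
    rows = []
    for row in aff:
        total = sum(row)
        # A sentence without out-links spreads its score uniformly.
        rows.append([v / total for v in row] if total > 0 else [1.0 / n] * n)
    score = [1.0 / n] * n
    for _ in range(iterations):
        score = [
            (1 - damping) / n
            + damping * sum(score[i] * rows[i][j] for i in range(n))
            for j in range(n)
        ]
    return score


def select_summary(aff, richness, k=2, penalty=1.0):
    """Greedy selection: pick the top sentence, then penalize its neighbours."""
    score = list(richness)
    remaining = set(range(len(score)))
    chosen = []
    while remaining and len(chosen) < k:
        # Ties are broken in favour of the lower sentence index.
        best = max(remaining, key=lambda i: (score[i], -i))
        chosen.append(best)
        remaining.discard(best)
        for j in remaining:
            # Assumed penalty form: the more j overlaps with the chosen
            # sentence, the more of its score is removed.
            score[j] -= penalty * aff[j][best] * richness[best]
    return chosen


sentences = [
    "the cat sat on the mat",
    "the cat sat on the mat today",
    "stocks fell sharply on monday",
]
doc_ids = [0, 1, 1]  # sentence 0 is from one document, 1 and 2 from another
aff = affinity_matrix(sentences, doc_ids)
richness = information_richness(aff)
summary = [sentences[i] for i in select_summary(aff, richness, k=2)]
```

Setting `penalty=0` reduces the selection step to a plain ranking by information richness, which makes it easy to see how much the diversity penalty changes the chosen sentences.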

Editor information

László Kovács, Norbert Fuhr, Carlo Meghini

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wan, X., Xiao, J. (2007). Towards a Unified Approach Based on Affinity Graph to Various Multi-document Summarizations. In: Kovács, L., Fuhr, N., Meghini, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2007. Lecture Notes in Computer Science, vol 4675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74851-9_25

  • DOI: https://doi.org/10.1007/978-3-540-74851-9_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74850-2

  • Online ISBN: 978-3-540-74851-9

  • eBook Packages: Computer Science, Computer Science (R0)
