Abstract
This paper presents an extensive literature review of the research field of Text Summarisation (TS) based on Human Language Technologies (HLT). TS helps users manage the vast amount of information available by condensing documents’ content and extracting the most relevant facts or topics they contain. The rapid development of emerging technologies poses new challenges to this research field, many of which remain unsolved. It is therefore essential to analyse its progress over the years and to provide an overview of past, present and future directions, highlighting the main advances achieved and outlining the remaining limitations. To this end, the survey addresses several important aspects. On the one hand, the paper gives a general perspective on the state of the art, describing the main concepts, the different summarisation approaches, and the relevant international forums. It also stresses that new requirements and scenarios have led to new types of summaries with specific purposes (e.g. sentiment-based summaries) and to novel domains for which TS has also proven suitable (e.g. blogs). In addition, TS has been successfully combined with a number of intelligent systems based on HLT (e.g. information retrieval, question answering, and text classification). On the other hand, the paper presents an in-depth study of the evaluation of summaries, explaining the existing methodologies and systems as well as recent research on the automatic evaluation of summaries’ quality. Finally, some thoughts about TS in general and its future are offered to encourage the reader to think of novel approaches, applications and lines of research for the coming years. The analysis of these issues gives the reader a broad and useful background on the most important aspects of this research field.
Discover the latest articles, news and stories from top researchers in related subjects.References
Agnihotri L, Kender JR, Dimitrova N, Zimmerman J (2005) User study for generating personalized summary profiles. In: Proceedings of the IEEE international conference on multimedia and expo (ICME). pp 1094–1097
Ahmet A, Gaizauskas R (2010) Generating image descriptions using dependency relational patterns. In: Proceedings of the 48th annual meeting of the association for computational linguistics
Aker A, Gaizauskas R (2009) Summary generation for toponym-referenced images using object type language models. In: Proceedings of the international conference on recent advances in natural language processing (RANLP-2009)
Aker A, Gaizauskas R (2010) Model summaries for OPTlocation-related images. In: Proceedings of language resources and evaluation
Amigó E, Gonzalo J, Peñas A, Verdejo F (2005) QARLA: a framework for the evaluation of text summarization systems. In: ACL ’05: proceedings of the 43rd annual meeting on association for computational linguistics. pp 280–289
Ando R, Boguraev B, Byrd R, Neff M (2005) Visualization-enabled multi-document summarization by Iterative Residual Rescaling. Nat Lang Eng 11(1): 67–86
Angheluta R, Busser RD, Francine Moens M (2002) The use of topic segmentation for automatic summarization. In: Proceedings of the ACL-2002 post-conference workshop on automatic summarization. pp 66–70
Aone C, Okurowski ME, Gorlinsky J (1998) Trainable, scalable summarization using robust NLP and machine learning. In: Proceedings of the 36th annual meeting of the association for computational linguistics and 17th international conference on computational linguistics, vol 1. pp 62–66
Azzam S, Humphreys K, Gaizauskas R (1999) Using coreference chains for text summarization. In: Proceedings of the ACL’99 workshop on coreference and its applications
Balahur A, Montoyo A (2008) Multilingual feature-driven opinion extraction and summarization from customer reviews. In: Proceedings of 13th international conference on applications of natural language to information systems. pp 345–346
Balahur A, Lloret E, Ferrández O, Montoyo A, Palomar M, Muñoz R (2008) The DLSIUAES team’s participation in the TAC 2008 tracks. In: Proceedings of the text analysis conference (TAC)
Balahur A, Lloret E, Boldrini E, Montoyo A, Palomar M, Martinez-Barco P (2009) Summarizing threads in blogs using opinion polarity. In: Proceedings of the international workshop on events in emerging text types (eETTs). pp 5–13
Balahur-Dobrescu A, Kabadjov M, Steinberger J, Steinberger R, Montoyo A (2009) Summarizing opinions in blog threads. In: Proceedings of the Pacific Asia conference on language, information and computation conference. pp 606–613
Baldwin B, Morton TS (1998) Dynamic coreference-based summarization. In: Proceedings of the third conference on empirical methods in natural language processing (EMNLP-3)
Barzilay R, Elhadad M (1999) Using lexical chains for text summarization. In: Advances in automatic text summarization. pp 111–122
Barzilay R, Lapata M (2005) Modeling local coherence: an entity-based approach. In: Proceedings of the 43rd annual meeting of the association for computational linguistics (ACL’05). pp 141–148
Barzilay R, McKeown KR (2005) Sentence fusion for multidocument news summarization. Comput Linguist 31(3): 297–328
Beineke P, Hastie T, Manning C, Vaithyanathan S (2004) An exploration of sentiment summarization. In: Proceedings of the AAAI spring symposium on exploring attitude and affect in text: theories and applications
Bellemare S, Bergler S, Witte R (2008) ERSS at TAC 2008. In: Proceedings of the text analysis conference (TAC)
Belz A (2008) Automatic Generation of Weather Forecast Texts Using Comprehensive Probabilistic Generation-space Models. Nat Lang Eng 14(4): 431–455
Berkovsky S, Baldwin T, Zukerman I (2008) Aspect-based personalized text summarization. In: Proceedings of the 5th international conference on adaptive hypermedia and adaptive web-based systems. pp 267–270
Biadsy F, Hirschberg J, Filatova E (2008) An unsupervised approach to biography production using Wikipedia. In: Proceedings of ACL-08: HLT. pp 807–815
Boguraev BK, Neff MS (2000) Discourse segmentation in aid of document summarization. In: Proceedings of the 33rd Hawaii international conference on system sciences, vol 3. p 3004
Bossard A, Généreux M, Poibeau T (2008) Description of the LIPN systems at TAC 2008: summarizing information and opinions. In: Proceedings of the text analysis conference (TAC)
Branny E (2007) Automatic summary evaluation based on text grammars. J Digit Inf 8(3). http://journals.tdl.org/jodi/article/viewArticle/232
Burges C, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender G (2005) Learning to rank using gradient descent. In: Proceedings of the 22nd international conference on Machine learning. pp 89–96
Carenini G, Cheung JCK (2008) Extractive vs. NLG-based abstractive summarization of evaluative text: the effect of corpus controversiality. In: Proceedings of the fifth international natural language generation conference, ACL 2008. pp 33–40
Cesarano C, Mazzeo A, Picariello A (2007) A system for summary-document similarity in notary domain. International Workshop on Database Expert Syst Appl:254–258
Ceylan H, Mihalcea R (2009) The decomposition of human-written book summaries. In: Proceedings of the 10th international conference on computational linguistics and intelligent text processing (CICLing ’09). pp 582–593
Cole, R (ed) (1997) Survey of the state of the art in human language technology. Cambridge University Press, Cambridge
Conroy J, Schlesinger J (2008) CLASSY at TAC 2008 Metrics. In: Proceedings of the text analysis conference (TAC)
Conroy JM, Dang HT (2008) Mind the gap: dangers of divorcing evaluations of summary content from linguistic quality. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 145–152
Conroy JM, O’leary DP (2001) Text summarization via hidden Markov models. In: SIGIR ’01: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. pp 406–407
Conroy JM, Schlesinger JD, O’Leary DP (2009) CLASSY 2009: summarization and metrics. In: Proceedings of the text analysis conference (TAC)
Cristea D, Postolache O, Pistol I (2005) Summarisation through discourse structure. In: Proceedings of the computational linguistics and intelligent text processing, 6th International conference (CICLing 2005). pp 632–644
Cunha ID, Fernández S, Velázquez-Morales P, Vivaldi J, SanJuan E, Moreno JMT (2007) A new hybrid summarizer based on vector space model, statistical physics and linguistics. In: MICAI 2007: advances in artificial intelligence. pp 872–882
Dang HT (2006) Overview of DUC 2006. In: The document understanding workshop (presented at the HLT/NAACL). Brooklyn, New York, USA
Demner-Fushman D, Lin J (2006) Answer extraction, semantic clustering, and extractive summarization for clinical question answering. In: Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics. pp 841–848
Deschacht K, Moens MF (2007) Text analysis for automatic image annotation. In: Proceedings of the 45th annual meeting of the association of computational linguistics. pp 1000–1007
Díaz A, Gervás P (2007) User-model based Personalized Summarization. Inf Process Manag 43(6): 1715–1734
Donaway RL, Drummey KW, Mather LA (2000) A comparison of rankings produced by summarization evaluation measures. In: Proceedings of NAACL-ANLP 2000 workshop on automatic summarization. pp 69–78
Dunlavy DM, O’Leary DP, Conroy JM, Schlesinger JD (2007) QCS: A system for querying, clustering and summarizing documents. Inf Process Manag 43(6): 1588–1605
Edmundson HP (1969) New methods in automatic extracting. In: Mani I, Maybury M (eds) Advances in automatic text summarization. pp 23–42
Elsner M, Charniak E (2008) Coreference-inspired coherence modeling. In: Proceedings of ACL-08: HLT, short papers. pp 41–44
Ercan G, Cicekli I (2008) Lexical cohesion based topic modeling for summarization. In: Proceedings of the 9th international conference in computational linguistics and intelligent text processing. pp 582–592
Erkan G, Radev DR (2004) LexRank: Graph-based Lexical Centrality as Salience in Text Summarization. J Artif Intell Res (JAIR) 22: 457–479
Fan J, Gao Y, Luo H, Keim DA, Li Z (2008) A novel approach to enable semantic and visual image summarization for exploratory image search. In: MIR ’08: proceeding of the 1st ACM international conference on multimedia information retrieval. pp 358–365
Fellbaum C (1998) WordNet: an electronical lexical database. The MIT Press, Cambridge
Feng Y, Lapata M (2008) Automatic image annotation using auxiliary text information. In: Proceedings of ACL-08: HLT. pp 272–280
Filatova E, Hatzivassiloglou V (2004) Event-based extractive summarization. In: Marie-Francine Moens SS (ed) Text summarization branches out: proceedings of the ACL-04 workshop. pp 104–111
Fisher S, Dunlop A, Roark B, Chen Y, Burmeister J (2009) OHSU summarization and entity linking systems. In: Proceedings of the text analysis conference (TAC)
Fiszman M, Rindflesch TC, Kilicoglu H (2004) Abstraction summarization for managing the biomedical research literature. In: Moldovan D, Girju R (eds) HLT-NAACL 2004: workshop on computational lexical semantics. pp 76–83
Fuentes M, González E, Ferrés D, Rodríguez H (2005) QASUM-TALP at DUC 2005 automatically evaluated with a pyramid based metric. In: The document understanding workshop (presented at the HLT/EMNLP annual meeting)
Fuentes M, Alfonseca E, Rodríguez H (2007) Support vector machines for query-focused summarization trained and evaluated on pyramid data. In: Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of the demo and poster sessions. pp 57–60
Fukushima T, Okumura M (2001) Text summarization challenge: text summarization evaluation at NTCIR workshop 2. In: Proceedings of the second NTCIR workshop meeting on evaluation of chinese and japanese text retrieval and text summarization. pp 9–13
Giannakopoulos G, Karkaletsis V (2009) N-GRAM GRAPHS: representing documents and document sets in summary system evaluation. In: Proceedings of the text analysis conference (TAC)
Giannakopoulos G, Karkaletsis V, Vouros G (2008a) Testing the use of n-gram graphs in summarization sub-tasks. In: Proceedings of the text analysis conference (TAC)
Giannakopoulos G, Karkaletsis V, Vouros G, Stamatopoulos P (2008) Summarization System Evaluation Revisited: N-gram graphs. ACM Trans Speech Lang Process 5(3): 1–39
Goldstein J, Mittal V, Carbonell J, Kantrowitz M (2000) Multi-document summarization by sentence extraction. In: NAACL-ANLP 2000 workshop on automatic Summarization. pp. 40–48
Gonçalves PN, Rino L, Vieira R (2008) Summarizing and referring: towards cohesive extracts. In: DocEng ’08: proceeding of the eighth ACM symposium on document engineering. pp 253–256
Gotti F, Lapalme G, Nerima L, Wehrli E (2007) GOFAISUM: a symbolic summarizer for DUC. In: The document understanding workshop (presented at the HLT/NAACL)
Grosz BJ, Weinstein S, Joshi AK (1995) Centering: A Framework for Modeling the Local Coherence of Discourse. Comput Linguist 21(2): 203–225
Harabagiu S, Lacatusu F (2005) Topic themes for multi-document summarization. In: SIGIR ’05: proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. pp 202–209
Hasler L (2007) From extracts to abstracts: human summary production operations for computer-aided summarisation. In: Proceedings of the RANLP 2007 workshop on computer-aided language processing (CALP). pp 11–18
Hasler L (2008) Centering theory for evaluation of coherence in computer-aided summaries. In: (ELRA) ELRA (ed) Proceedings of the sixth international conference on language resources and evaluation (LREC’08)
Hassel M (2007) Resource lean and portable automatic text summarization. PhD thesis, Department of Numerical Analysis and Computer Science, Royal Institute of Technology
He L, Sanocki E, Gupta A, Grudin J (1999) Auto-summarization of audio-video presentations. In: MULTIMEDIA ’99: proceedings of the seventh ACM international conference on multimedia (Part 1). pp 489–498
He T, Chen J, Gui Z, Li F (2008) CCNU at TAC 2008: proceeding on using semantic method for automated summarization. In: Proceedings of the text analysis conference (TAC)
Hearst MA (1997) TextTiling: segmenting text into multi-paragraph subtopic passages. Comput Linguist 23(1): 33–64
Hirao T, Okumura M, Fukusima T, Nanba H (2005) Text summarization challenge 3—text summarization evaluation at NTCIR workshop 4. In: Proceedings of the fourth NTCIR workshop on research in information access technologies information retrieval, question answering and summarization. pp 407–411
Hovy E, Lin CY (1999) Automated multilingual text summarization and its evaluation. Technical report Information Sciences Institute, University of Southern California
Hovy E, Lin CY, Zhou L, Fukumoto J (2006) Automated summarization evaluation with basic elements. In: Proceedings of the 5th international conference on language resources and evaluation (LREC)
Jaoua M, Hamadou AB (2003) Automatic text summarization of scientific articles based on classification of extract’s Population. In: Proceedings of computational linguistics and intelligent text processing, 4th international conference. pp 623–634
Jing H (2002) Using hidden Markov modeling to decompose human-written summaries. Comput Linguist 28(4): 527–543
Jing H, McKeown KR (2000) Cut and paste based text summarization. In: Proceedings of the 1st North American chapter of the association for computational linguistics Conference. pp 178–185
Kaisser M, Hearst MA, Lowe JB (2008) Improving search results quality by customizing summary lengths. In: Proceedings of ACL-08: HLT. pp 701–709
Kan MY, Klavans JL (2002) Using librarian techniques in automatic text summarization for information retrieval. In: JCDL ’02: proceedings of the 2nd ACM/IEEE-CS joint conference on digital libraries. pp 36–45
Kan MY, Klavans JL, Mckeown KR (2002) Using the annotated bibliography as a resource for indicative summarization. In: Proceedings of the language resources and evaluation conference. pp 1746–1752
Katragadda R (2010) GEMS: generative modeling for evaluation of summaries. In: Proceedings of the 11th international conference on computational linguistics and intelligent text processing, CICLing. pp 724–735
Kazantseva A (2006) An approach to summarizing short stories. In: Proceedings of the student research workshop at the 11th conference of the European chapter of the association for computational linguistics. pp 55–62
Ker SJ, Chen JN (2000) A text categorization based on summarization technique. In: Proceedings of the ACL-2000 workshop on recent advances in natural language processing and information retrieval. pp 79–83
Khan AU, Khan S, Mahmood W (2005) MRST: a new technique for information summarization. In: The second world enformatika conference, WEC’05. pp 249–252
Kumar C, Pingali P, Varma V (2008) Generating personalized summaries using publicly available web documents. In: Proceedings of the 2008 IEEE/WIC/ACM international conference on web intelligence and international conference on intelligent agent technology. pp 103–106
Kumar M, Das D, Agarwal S, Rudnicky A (2009) Non-textual event summarization by applying machine learning to template-based language generation. In: Proceedings of the 2009 workshop on language generation and summarisation (UCNLG + Sum 2009). pp 67–71
Kuo JJ, Chen HH (2008) Multidocument Summary Generation: Using Informative and Event Words. ACM Trans Asian Lang Inf Process (TALIP) 7(1): 1–23
Kupiec J, Pedersen J, Chen F (1995) A trainable document summarizer. In: SIGIR ’95: proceedings of the 18th annual international ACM SIGIR conference on research and development in information retrieval. pp 68–73
Lapata M, Barzilay R (2005) Automatic evaluation of text coherence: models and representations. In: Proceedings of the 19th international joint conference on artificial intelligence. pp 1085–1090
Lerman K, McDonald R (2009) Contrastive summarization: an experiment with consumer reviews. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the association for computational linguistics, companion volume: short papers. pp 113–116
Lerman K, Blair-Goldensohn S, McDonald R (2009) Sentiment summarization: evaluating and learning user preferences. In: Proceedings of the 12th conference of the European chapter of the ACL (EACL 2009). pp 514–522
Li S, Ouyang Y, Wang W, Sun B (2007) Multi-document summarization using support vector regression. In: The document understanding workshop (presented at the HLT/NAACL). Rochester, New York USA
Li S, Wan W, Wang C (2008) TAC 2008 update summarization task of ICL. In: Proceedings of the text analysis conference (TAC)
Li S, Wang W, Zhang Y (2009) Tac 2009 update summarization of icl. In: Proceedings of the text analysis conference (TAC)
Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. In: Proceedings of ACL text summarization workshop. pp 74–81
Lin CY, Hovy E (2000) The automated acquisition of topic signatures for text summarization. In: Proceedings of the 18th conference on computational linguistics. pp 495–501
Liu F, Liu Y (2008) Correlation between ROUGE and human evaluation of extractive meeting summaries. In: Proceedings of ACL-08: HLT, short papers. pp 201–204
Liu M, Yu B, Fang F, Sun H (2009) TAC 2009 update summarization task of WUST. In: Proceedings of the text analysis conference (TAC)
Lloret E, Palomar M (2009) A gradual combination of features for building automatic summarisation systems. In: Proceedings of the 12th international conference on text, speech and dialogue (TSD). pp 16–23
Lloret E, Ferrández O, Muñoz R, Palomar M (2008) A text summarization approach under the influence of textual entailment. In: Proceedings of the 5th international workshop on natural language processing and cognitive science (NLPCS 2008). pp 22–31
Lloret E, Balahur A, Palomar M, Montoyo A (2009) Towards building a competitive opinion summarization system: challenges and keys. In: Proceedings of the NAACL. Student Research Workshop and Doctoral Consortium. pp 72–77
Lloret E, Saggion H, Palomar M (2010) Experiments on summary-based opinion classification. In: Proceedings of the NAACL HLT 2010 workshop on computational approaches to analysis and generation of emotion in text. pp 107–115
Luhn HP (1958) The automatic creation of literature abstracts. In: Advances in automatic text summarization. pp 15–22
Mani I (2001) Automatic summarization. John Benjamins Publishing Co. Amsterdam, Philadelphia, USA
Mani I (2001b) Summarization evaluation: an overview. In: Proceedings of the North American chapter of the association for computational linguistics (NAACL). Workshop on Automatic Summarization
Mani I, Maybury MT (1999) Advances in automatic text summarization. The MIT Press, Cambridge
Mani I, House D, Klein G, Hirschman L, Firmin T, Sundheim B (1999) The TIPSTER SUMMAC text summarization evaluation. In: Proceedings of the ninth conference on European chapter of the association for computational linguistics. pp 77–85
Mani I, Klein G, House D, Hirschman L, Firmin T, Sundheim B (2002) SUMMAC: a text summarization evaluation. Nat Lang Eng 8(1): 43–68
Mann WC, Thompson SA (1988) Rhetorical structure theory: Toward a functional theory of text organization. Text 8(3): 243–281
Manning CD, Raghavan P, Schtze H (2008) Introduction to information retrieval. Cambridge University Press, New York, NY, USA
Marcu D (1999) Discourse trees are good indicators of importance in text. In: Advances in automatic text summarization. pp 123–136
McCargar V (2005) Statistical Approaches to Automatic Text Summarization. Bull Am Soc Inf Sci Technol 30(4): 21–25
Medelyan O (2007) Computing lexical chains with graph clustering. In: Proceedings of the ACL 2007 student research workshop. pp 85–90
Mihalcea R (2004) Graph-based ranking algorithms for sentence extraction, applied to text summarization. In: Proceedings of the ACL 2004 on interactive poster and demonstration sessions. p 20
Mihalcea R, Ceylan H (2007) Explorations in automatic book summarization. In: Proceedings of the joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). pp 380–389
Mille S, Wanner L (2008) Multilingual summarization in practice: the case of patent claims. In: Proceedings of the 12th European association of machine translation conference. pp 120–129
Minel JL, Nugier S, Piat G (1997) How to appreciate the quality of automatic text summarization? Examples of FAN and MLUCE protocols and their results on SERAPHIN. In: Proceedings of intelligent scalable text summarization workshop in conjunction with the European chapter of the association of computational linguistics (EACL). pp 25–30
Mitkov R, Evans R, Orasan C, Ha LA, Pekar V (2007) Anaphora resolution: to what extent does it help NLP applications? In: Proceedings of the 6th discourse anaphora and anaphor resolution colloquium. pp 179–190
Mohammad S, Dorr B, Egan M, Hassan A, Muthukrishan P, Qazvinian V, Radev D, Zajic D (2009) Using citations to generate surveys of scientific paradigms. In: Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the association for computational linguistics. pp 584–592
Mori T (2002) Information gain ratio as term weight: the case of summarization of IR results. In: Proceedings of the 19th international conference on computational linguistics. pp 1–7
Mori T, Nozawa M, Asada Y (2004) Multi-answer-focused multi-document summarization using a question-answering engine. In: COLING ’04: proceedings of the 20th international conference on computational linguistics. pp 439–445
Mori T, Nozawa M, Asada Y (2005) Multi-answer-focused multi-document summarization using a question-answering engine. ACM Trans Asian Lang Inf Process (TALIP) 4(3): 305–320
Morris AH, Kasper GM, Adams DA (1992) The Effect and Limitations of Automatic Text Condensing on Reading Comprehension Performance. Inf Syst Res 3(1): 17–35
Nastase V, Milne D, Filippova K (2009) Summarizing with encyclopedic knowledge. In: Proceedings of the text analysis conference (TAC)
Nenkova A (2005) Automatic text summarization of newswire: lessons learned from the document understanding conference. In: Proceedings of the American association fro artificial intelligence (AAAI). pp 1436–1441
Nenkova A (2006) Summarization evaluation for text and speech: issues and Approaches. In: INTERSPEECH-2006, paper 2079-Wed1WeS.1
Nenkova A, Siddharthan A, McKeown K (2005) Automatically learning cognitive status for multi-document summarization of newswire. In: HLT ’05: proceedings of the conference on human language technology and empirical methods in natural language processing. pp 241–248
Neto JL, Santos A, Kaestner CAA, Freitas AA (2000) Generating text summaries through the relative importance of topics. In: IBERAMIA-SBIA ’00: proceedings of the international joint conference, 7th Ibero-American conference on AI. pp 300–309
Okumura M, Fukusima T, Nanba H, Hirao T (2004) Text Summarization Challenge 2 text summarization evaluation at NTCIR workshop 3. SIGIR Forum 38(1): 29–38
Orăsan C (2004) The influence of personal pronouns for automatic summarisation of scientific articles. In: Proceedings of the discourse anaphora and anaphor resolution colloquium. pp 127–132
Orăsan C (2007) Pronominal anaphora resolution for text summarisation. In: Proceedings of the recent advances on natural language processing. pp 430–436
Orăsan C (2009) Comparative Evaluation of Term-Weighting Methods for Automatic Summarization. J Quant Linguist 16(1): 67–95
Orăsan C, Pekar V, Hasler L (2004) A comparison of summarisation methods based on term specificity estimation. In: Proceedings of the fourth international conference on language resources and evaluation (LREC2004). pp 1037–1041. Available at:http://clg.wlv.ac.uk/papers/orasan-04a.pdf
Over P, Ligget W (2002) Introduction to DUC: an intrinsic evaluation of generic news text summarization systems. In: The document understanding workshop
Over P, Dang H, Harman D (2007) DUC in Context. Inf Process Manag 43(6): 1506–1520
Owczarzak K (2009) DEPEVAL(summ): dependency-based evaluation for automatic summaries. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 190–198
Pang B, Lee L (2005) Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the association of computational linguistics. pp 115–124
Pang B, Lee L (2008) Opinion Mining and Sentiment Analysis. Found Trends Inf Retr 2(1–2): 1–135
Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of 40th annual meeting of the association for computational linguistics. pp 311–318
Passoneau RJ (2010) Formal and Functional Assessment of the Pyramid Method for Summary Content Evaluation. Nat Lang Eng 16(2): 107–131
Pitler E, Nenkova A (2008) Revisiting readability: a unified framework for predicting text quality. In: Proceedings of the 2008 conference on empirical methods in natural language processing. pp 186–195
Plaza L, Díaz A, Gervás P (2008) Concept-graph based biomedical automatic Summarization Using Ontologies. In: Coling 2008: Proceedings of the 3rd textgraphs workshop on graph-based algorithms for natural language processing. pp 53–56
Plaza L, Lloret E, Aker A (2010) Improving automatic image captioning using text summarization techniques. In: Proceedings of the 13th international conference on text, speech and dialogue (TSD)
Qazvinian V, Radev DR (2008) Scientific paper summarization using citation summary networks. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 689–696
Radev DR, Fan W (2000) Automatic summarization of search engine hit lists. In: Proceedings of the ACL-2000 workshop on recent advances in natural language processing and information retrieval. pp 99–109
Radev DR, McKeown KR (1998) Generating Natural Language Summaries from Multiple on-line Sources. Comput Linguist 24(3): 470–500
Radev DR, Tam D (2003) Summarization evaluation using relative utility. In: CIKM ’03: proceedings of the 12th international conference on information and knowledge management. pp 508–511
Radev DR, Blair-Goldensohn S, Zhang Z (2001) Experiments in single and multi-document summarization using MEAD. In: First document understanding conference. pp 1–7
Radev DR, Hovy E, McKeown K (2002) Introduction to the Special Issue on Summarization. Comput Linguist 28(4): 399–408
Saggion H (2008) Automatic summarization: an overview. Revue franaise de linguistique appliquée XIII(1). pp 63–81
Saggion H (2009) A classification algorithm for predicting the structure of summaries. In: Proceedings of the 2009 workshop on language generation and summarisation (UCNLG+Sum 2009). pp 31–38
Saggion H, Funk A (2009) Extracting Opinions and Facts for Business Intelligence. RNTI E-17: 119–146
Saggion H, Lapalme G (2000) Selective analysis for automatic abstracting: evaluating indicativeness and acceptability. In: Proceedings of content-based multimedia information access (RIAO). pp 747–764
Saggion H, Lloret E, Palomar M (2010) Using text summaries for predicting rating scales. In: Proceedings of the 1st workshop on computational approaches to subjectivity and sentiment analysis (WASSA)
Sakai T, Sparck-Jones K (2001) Generic summaries for indexing in information retrieval. In: SIGIR ’01: proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval. pp 190–198
Saravanan M, Ravindran B, Raman S (2006) Improving legal document summarization using graphical models. In: Proceedings of legal knowledge and information systems—JURIX 2006: the 19th annual conference on legal knowledge and information systems. pp 51–60
Sauper C, Barzilay R (2009) Automatically generating wikipedia articles: a structure-aware approach. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP. pp 208–216
Schilder F, Kondadadi R (2008) FastSum: fast and accurate query-based multi-document summarization. In: Proceedings of ACL-08: HLT, short papers. pp 205–208
Schilder F, Kondadadi R, Leidner JL, Conrad JG (2008) Thomson reuters at TAC 2008: aggressive filtering with FastSum for update and opinion summarization. In: Proceedings of the text analysis conference (TAC)
Schlesinger JD, Okurowski ME, Conroy JM, O’Leary DP, Taylor A, Hobbs J, Wilson H (2002) Understanding machine performance in the context of human performance for multi-document summarization. In: Proceedings of the DUC 2002 workshop on text summarization
Sebastiani F (2002) Machine Learning in Automated Text Categorization. ACM Comput Surv 34(1): 1–47
Shen D, Chen Z, Yang Q, Zeng HJ, Zhang B, Lu Y, Ma WY (2004) Web-page classification through summarization. In: SIGIR ’04: proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval. pp 242–249
Shen D, Yang Q, Chen Z (2007) Noise Reduction through Summarization for Web-page Classification. Inf Process Manag 43(6): 1735–1747
Shi Z, Melli G, Wang Y, Liu Y, Gu B, Kashani MM, Sarkar A, Popowich F (2007) Question answering summarization of multiple biomedical documents. In: CAI ’07: proceedings of the 20th conference of the Canadian society for computational studies of intelligence on advances in artificial intelligence. pp 284–295
Sjöbergh J (2007) Older Versions of the ROUGEeval Summarization Evaluation System were Easier to Fool. Inf Process Manag 43(6): 1500–1505
Spärck Jones K (1999) Automatic summarizing: factors and directions. In: Advances in automatic text summarization. pp 1–14
Spärck Jones K (2007) Automatic Summarising: The State of the Art. Inf Process Manag 43(6): 1449–1481
Spärck Jones K, Galliers JR (eds) (1996) Evaluating natural language processing systems, an analysis and review, lecture notes in computer science, vol 1083. Springer, Berlin
Steinberger J, Poesio M, Kabadjov MA, Ježek K (2007) Two Uses of Anaphora Resolution in Summarization. Inf Process Manag 43(6): 1663–1680
Steinberger J, Ježek K, Sloup M (2008) Web topic summarization. In: Proceedings of the 12th international conference on electronic publishing. pp 322–334
Strzalkowski T, Harabagiu S (2007) Advances in open domain question answering (Text, Speech and Language Technology). Springer-Verlag New York, Inc., Secaucus, NJ, USA
Sun JT, Shen D, Zeng HJ, Yang Q, Lu Y, Chen Z (2005) Web-page summarization using clickthrough data. In: SIGIR ’05: proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. pp 194–201
Svore KM, Vanderwende L, Burges CJ (2007) Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of the joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). pp 448–457
Sweeney S, Crestani F, Losada DE (2008) Show me more: Incremental length summarisation using novelty detection. Inf Process Manag 44(2): 663–686
Szlávik Z, Tombros A, Lalmas M (2006) Investigating the use of summarisation for interactive XML retrieval. In: SAC ’06: Proceedings of the 2006 ACM symposium on applied computing. pp 1068–1072
Teng Z, Liu Y, Ren F, Tsuchiya S, Ren F (2008) Single document summarization based on local topic identification and word frequency. In: MICAI ’08: proceedings of the 2008 seventh Mexican international conference on artificial intelligence. pp 37–41. http://dx.doi.org/10.1109/MICAI.2008.12
Teufel S, Halteren Hv (2004) Evaluating information content by factoid analysis: human annotation and stability. In: Proceedings of the 2004 conference on empirical methods in natural language processing. pp 419–426
Teufel S, Moens M (2002) Summarizing Scientific Articles: Experiments with Relevance and Rhetorical Status. Comput Linguist 28(4): 409–445
Titov I, McDonald R (2008) A joint model of text and aspect ratings for sentiment summarization. In: Proceedings of ACL-08: HLT. pp 308–316
Torres-Moreno JM, St-Onge PL, Gagnon M, El-Bèze M, Bellot P (2009) Automatic summarization system coupled with a question-answering system (QAAS). NLP News Computing Language. http://arxiv.org/abs/0905.2990v1
Trappey A, Trappey C, Wu CY (2009) Automatic Patent Document Summarization for Collaborative Knowledge Systems and Services. J Syst Sci Syst Eng 18(1): 71–94
Trappey AJC, Trappey CV (2008) An R&D Knowledge Management Method for Patent Document Summarization. Ind Manag Data Syst 108(2): 245–257
Tseng YH, Lin CJ, Lin YI (2007) Text Mining Techniques for Patent Analysis. Inf Process Manag 43(5): 1216–1247
Vadlapudi R, Katragadda R (2010a) On automated evaluation of readability of summaries: capturing grammaticality, focus, structure and coherence. In: Proceedings of the NAACL HLT 2010 student research workshop. pp 7–12
Vadlapudi R, Katragadda R (2010b) Quantitative evaluation of grammaticality of summaries. In: Proceedings of the 11th international conference on computational linguistics and intelligent text processing, CICLing. pp 736–747
Van Dijk TA (1972) Some aspects of text grammars: a study in theoretical linguistics and poetics. Mouton, The Hague-Paris
Van Rijsbergen CJ (1981) Information retrieval. Elsevier, Amsterdam
Wan X, Yang J, Xiao J (2007) Towards a unified approach based on affinity graph to various multi-document summarizations. In: Proceedings of the 11th European conference. pp 297–308
Wang C, Long L, Li L (2008) HowNet based evaluation for Chinese text summarization. In: Proceedings of the international conference on natural language processing and software engineering. pp 82–87
Wilson T, Wiebe J, Hoffmann P (2005) Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of the empirical methods in natural language processing. pp 347–354
Witte R, Krestel R, Bergler S (2007) Generating update summaries for DUC 2007. In: The document understanding workshop (presented at the HLT/NAACL)
Wong KF, Wu M, Li W (2008) Extractive summarization using supervised and semi-supervised learning. In: Proceedings of the 22nd international conference on computational linguistics (Coling 2008). pp 985–992
Yu J, Reiter E, Hunter J, Mellish C (2007) Choosing the Content of Textual Summaries of Large Time-series Data Sets. Nat Lang Eng 13(1): 25–49
Zajic D, Dorr BJ, Lin J, Schwartz R (2007) Multi-candidate reduction: Sentence compression as a tool for document summarization tasks. Inf Process Manag 43(6): 1549–1570
Zajic DM, Dorr BJ, Lin J (2008) Single-document and multi-document summarization techniques for email threads using sentence compression. Inf Process Manag 44(4): 1600–1610
Zechner K, Waibel A (2000) DiaSumm: flexible summarization of spontaneous dialogues in unrestricted domains. In: Proceedings of the 18th conference on computational linguistics. pp 968–974
Zhou L, Ticrea M, Hovy E (2004) Multi-document biography summarization. In: Proceedings of the conference on empirical methods in natural language processing. pp 434–441
Zhou L, Lin CY, Munteanu DS, Hovy E (2006) ParaEval: using paraphrases to evaluate summaries automatically. In: Proceedings of the human language technology/North American association of computational linguistics conference. pp 447–454
Zhuang L, Jing F, Zhu XY (2006) Movie review mining and summarization. In: CIKM ’06: proceedings of the 15th ACM international conference on information and knowledge management. pp 43–50
Lloret, E., Palomar, M. Text summarisation in progress: a literature review. Artif Intell Rev 37, 1–41 (2012). https://doi.org/10.1007/s10462-011-9216-z
DOI: https://doi.org/10.1007/s10462-011-9216-z