An empirical study of information synthesis tasks

Authors:

Felisa Verdejo

ACL '04: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

Pages 207 - es

https://doi.org/10.3115/1218955.1218982

Published: 21 July 2004

Abstract

This paper describes an empirical study of the "Information Synthesis" task, defined as the process of (given a complex information need) extracting, organizing and inter-relating the pieces of information contained in a set of relevant documents, in order to obtain a comprehensive, non-redundant report that satisfies the information need. Two main results are presented: a) the creation of an Information Synthesis testbed with 72 reports manually generated by nine subjects for eight complex topics with 100 relevant documents each; and b) an empirical comparison of similarity metrics between reports, under the hypothesis that the best metric is the one that best distinguishes between manually and automatically generated reports. A metric based on key-concept overlap gives better results than metrics based on n-gram overlap (such as ROUGE) or sentence overlap.
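To make the two families of metrics named in the abstract concrete, the following is a minimal sketch of an n-gram overlap score and a key-concept overlap score between two reports. It is an illustration only, not the paper's implementation: the function names, the whitespace tokenization, the recall-style normalization, and the toy inputs are all assumptions, and the abstract does not say how key concepts are identified, so they are passed in here as ready-made lists.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(candidate, reference, n=1):
    """Fraction of reference n-grams also found in the candidate.

    Assumption: lowercase whitespace tokenization and clipped counts,
    in the general spirit of n-gram overlap metrics such as ROUGE.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, cand[gram]) for gram, count in ref.items())
    return matched / total


def concept_overlap(candidate_concepts, reference_concepts):
    """Fraction of reference key concepts present in the candidate.

    Assumption: concepts are supplied as strings and compared
    case-insensitively; how they are extracted is out of scope.
    """
    cand = {c.lower() for c in candidate_concepts}
    ref = {c.lower() for c in reference_concepts}
    if not ref:
        return 0.0
    return len(cand & ref) / len(ref)


# Toy inputs, invented for illustration.
unigram_score = ngram_recall("the cat sat", "the cat sat on the mat", n=1)
bigram_score = ngram_recall("the cat sat", "the cat sat on the mat", n=2)
concept_score = concept_overlap(
    ["Report", "topic"],
    ["report", "topic", "subject", "metric"],
)
```

Under the abstract's hypothesis, a metric of either kind would be judged by how well its scores separate manually written reports from automatically generated ones; that comparison step is not sketched here because the abstract does not specify it.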

Information

Published In

ACL '04: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

July 2004

729 pages

General Chair:
Donia Scott
ITRI, University of Brighton

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
