research-article

Multimedia News Summarization in Search

Authors:

Hanqing Lu

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 7, Issue 3

Article No.: 33, Pages 1 - 20

https://doi.org/10.1145/2822907

Published: 01 February 2016

Abstract

Relieving users of the flood of news information and allowing them to grasp quickly and comprehensively what is happening in the world each day, and how, is a necessary but challenging task. In this article, we develop a novel approach to multimedia news summarization for Internet search results, which uncovers the underlying topics among query-related news information and threads the news events within each topic to generate a brief query-related overview. First, the hierarchical latent Dirichlet allocation (hLDA) model is introduced to discover the hierarchical topic structure of the query-related news documents, and a new approach based on weighted aggregation and max pooling is proposed to identify one representative news article for each topic. One representative image is also selected to visualize each topic as a complement to the text. Given the representative documents selected for each topic, a time-bias maximum spanning tree (MST) algorithm is proposed to thread them into a coherent and compact summary of their parent topic. Finally, we design a user-friendly interface that presents users with a hierarchical summary of the news they requested. Extensive experiments on a large-scale news dataset collected from multiple news Web sites demonstrate the encouraging performance of the proposed solution for news summarization in news retrieval.
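The threading step can be illustrated with a minimal sketch. The abstract does not state how the time bias enters the edge weights, so the formula below (content similarity multiplied by an exponential decay in the time gap) is an assumption made for illustration only, as are the function name `time_biased_max_spanning_tree`, the parameter `tau`, and the toy data. The tree itself is built with a standard Kruskal procedure run on edges sorted by descending weight; this is not necessarily the algorithm used in the article.

```python
import math
from itertools import combinations


def time_biased_max_spanning_tree(similarity, timestamps, tau=7.0):
    """Return the edges (i, j, weight) of a maximum spanning tree.

    ASSUMPTION (not from the article): the weight of edge (i, j) is
    similarity[i][j] * exp(-|t_i - t_j| / tau), so documents that are
    both similar and close in time are linked first.
    """
    n = len(timestamps)
    edges = []
    for i, j in combinations(range(n), 2):
        gap = abs(timestamps[i] - timestamps[j])
        edges.append((similarity[i][j] * math.exp(-gap / tau), i, j))
    # Kruskal's algorithm on descending weights yields a maximum spanning tree.
    edges.sort(reverse=True)

    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # skip edges that would close a cycle
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


# Toy example: four representative documents with publication days 0, 1, 2, 10
# and identical pairwise similarity, so only the time bias decides the thread.
timestamps = [0, 1, 2, 10]
similarity = [[0.5] * 4 for _ in range(4)]
thread = time_biased_max_spanning_tree(similarity, timestamps)
```

With equal similarities, the resulting tree links temporally adjacent documents, which is the behavior one would expect from a time-biased thread. A real system would replace the toy similarity matrix with topic-based or text-based similarities.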

Index Terms

Multimedia News Summarization in Search
1. Information systems
  1. Information retrieval

Published In

ACM Transactions on Intelligent Systems and Technology Volume 7, Issue 3

Regular Papers, Survey Papers and Special Issue on Recommender System Benchmarks

April 2016

472 pages

ISSN:2157-6904

EISSN:2157-6912

DOI:10.1145/2885506

Editor:
Yu Zheng
Microsoft Research, China

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 February 2016

Accepted: 01 September 2015

Revised: 01 July 2015

Received: 01 October 2014

Published in TIST Volume 7, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

973 Program
National Natural Science Foundation of China
