research-article

Generating event storylines from microblogs

Authors:

Tao Li

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management

Pages 175 - 184

https://doi.org/10.1145/2396761.2396787

Published: 29 October 2012

Abstract

Microblogging services have become a dominant web medium through which billions of individuals share and spread instant news and information. Monitoring how events evolve in the microblog sphere is therefore crucial both for a better user experience and for a deeper understanding of real-time events. In this paper we explore the problem of generating storylines from microblogs for user input queries. This problem is challenging because of the sparse, dynamic and social nature of microblogs. Given a query about an ongoing event, we propose to sketch the real-time storyline of the event with a two-level solution. We first propose a language model with dynamic pseudo relevance feedback to obtain relevant tweets, and then generate storylines via graph optimization. Comprehensive experiments on Twitter data sets demonstrate the effectiveness of the proposed methods at each level and of the overall framework.
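
The abstract names the two levels but gives no formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. Level one is approximated by a Dirichlet-smoothed unigram query-likelihood model with a single round of static pseudo relevance feedback (the paper's feedback is dynamic). Level two is approximated by a greedy dominating set over a Jaccard-similarity graph, chosen here as a simple stand-in for "graph optimization". All function names, parameters (`mu`, `top_k`, `n_expand`, `threshold`) and the sample tweets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real system would normalise hashtags, URLs, etc.
    return text.lower().split()


def score(query_terms, doc_tokens, coll_counts, coll_len, mu=10.0):
    """Log query likelihood of one tweet under a Dirichlet-smoothed unigram model."""
    doc_counts = Counter(doc_tokens)
    total = 0.0
    for term in query_terms:
        # Add-one collection estimate keeps unseen terms at non-zero probability.
        p_coll = (coll_counts[term] + 1) / (coll_len + len(coll_counts))
        total += math.log((doc_counts[term] + mu * p_coll) / (len(doc_tokens) + mu))
    return total


def retrieve(query, tweets, top_k=2, n_expand=2):
    """Rank tweets, expand the query from the top_k results, then re-rank."""
    docs = [tokenize(t) for t in tweets]
    coll_counts = Counter(tok for d in docs for tok in d)
    coll_len = sum(coll_counts.values())
    q = tokenize(query)

    def rank(terms):
        return sorted(range(len(docs)), reverse=True,
                      key=lambda i: score(terms, docs[i], coll_counts, coll_len))

    first_pass = rank(q)[:top_k]
    # Pseudo relevance feedback: treat the top_k tweets as relevant.
    feedback = Counter(tok for i in first_pass for tok in docs[i] if tok not in q)
    expanded = q + [term for term, _ in feedback.most_common(n_expand)]
    return rank(expanded), expanded


def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def greedy_dominating_set(items, threshold=0.3):
    """Select items so that every item is selected or similar to a selected one."""
    n = len(items)
    neighbours = [{j for j in range(n)
                   if j == i or jaccard(items[i], items[j]) >= threshold}
                  for i in range(n)]
    uncovered, chosen = set(range(n)), []
    while uncovered:
        # Each node covers itself, so every iteration covers at least one node.
        best = max(range(n), key=lambda i: len(neighbours[i] & uncovered))
        chosen.append(best)
        uncovered -= neighbours[best]
    return chosen


if __name__ == "__main__":
    tweets = ["earthquake hits city",
              "earthquake aftershock reported",
              "lunch was great"]
    ranking, expanded = retrieve("earthquake", tweets)
    relevant = [tokenize(tweets[i]) for i in ranking[:2]]
    storyline = [" ".join(relevant[i]) for i in greedy_dominating_set(relevant)]
    print(expanded, ranking, storyline)
```

The greedy step is the standard approximation heuristic for dominating set; it favours tweets that represent many similar tweets at once, which is one plausible way to keep a storyline short without losing coverage.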

Index Terms

Generating event storylines from microblogs
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Published In

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management

October 2012

2840 pages

ISBN:9781450311564

DOI:10.1145/2396761

General Chair: Xuewen Chen (Wayne State University, USA)

Program Chairs: Guy Lebanon (Georgia Institute of Technology), Haixun Wang (Microsoft Research Asia), Mohammed J. Zaki (Rensselaer Polytechnic Institute)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM'12: 21st ACM International Conference on Information and Knowledge Management

October 29 - November 2, 2012

Maui, Hawaii, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
