Research article · DOI: 10.1145/2396761.2396787

Generating event storylines from microblogs

Published: 29 October 2012

Abstract

Microblogging services have emerged as a dominant web medium through which billions of individuals share and spread instant news and information. Monitoring how events evolve in the microblog sphere is therefore crucial both for a better user experience and for a deeper understanding of real-time events. In this paper we explore the problem of generating storylines from microblogs for user input queries. This problem is challenging because of the sparse, dynamic, and social nature of microblogs. Given a query about an ongoing event, we propose to sketch the real-time storyline of the event with a two-level solution. We first propose a language model with dynamic pseudo relevance feedback to obtain relevant tweets, and then generate storylines via graph optimization. Comprehensive experiments on Twitter data sets demonstrate the effectiveness of the proposed methods at each level and of the overall framework.
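
The first level described in the abstract, a language model with pseudo relevance feedback, can be illustrated with a minimal sketch. It assumes Dirichlet-smoothed query likelihood scoring and a single, static feedback round. The abstract does not specify the smoothing, the feedback weighting, or what makes the feedback dynamic, so the function names (`score`, `rank`, `retrieve_with_feedback`) and the parameters `mu`, `top_k`, `n_expand`, and `alpha` are illustrative assumptions, not the paper's method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real tweet preprocessing."""
    return text.lower().split()


def score(query_weights, doc_tokens, coll_counts, coll_len, mu=10.0):
    """Weighted query log-likelihood under a Dirichlet-smoothed document model."""
    doc_counts = Counter(doc_tokens)
    total = 0.0
    for term, weight in query_weights.items():
        p_coll = coll_counts.get(term, 0) / coll_len
        if p_coll == 0.0:
            continue  # term unseen in the collection: it carries no evidence
        p_doc = (doc_counts.get(term, 0) + mu * p_coll) / (len(doc_tokens) + mu)
        total += weight * math.log(p_doc)
    return total


def rank(query_weights, docs, coll_counts, coll_len):
    """Return document indices ordered from highest to lowest score."""
    scores = [score(query_weights, d, coll_counts, coll_len) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


def retrieve_with_feedback(query, tweets, top_k=2, n_expand=2, alpha=0.5):
    """Rank tweets, expand the query from the top_k results, then rank again.

    Assumption: one static feedback round with frequency-based expansion terms.
    Returns (ranking, expanded_query_weights).
    """
    docs = [tokenize(t) for t in tweets]
    coll_counts = Counter(tok for d in docs for tok in d)
    coll_len = sum(coll_counts.values())

    # Initial query model: relative frequency of each query term.
    q_counts = Counter(tokenize(query))
    q_len = sum(q_counts.values())
    query_weights = {t: c / q_len for t, c in q_counts.items()}
    first_pass = rank(query_weights, docs, coll_counts, coll_len)

    # Treat the top_k tweets as pseudo-relevant and count their non-query terms.
    feedback = Counter(
        tok for i in first_pass[:top_k] for tok in docs[i] if tok not in q_counts
    )
    expansion = feedback.most_common(n_expand)
    if not expansion:
        return first_pass, query_weights

    # Interpolate the original query model with the feedback model.
    exp_total = sum(c for _, c in expansion)
    expanded = {t: alpha * w for t, w in query_weights.items()}
    for term, count in expansion:
        expanded[term] = (1.0 - alpha) * count / exp_total
    return rank(expanded, docs, coll_counts, coll_len), expanded
```

In this sketch, `alpha` controls how much weight stays on the original query terms. Expansion terms let a tweet that shares vocabulary with the top-ranked results score well even when it contains none of the query words, which is one way to address the sparsity of short posts.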


Published In

CIKM '12: Proceedings of the 21st ACM International Conference on Information and Knowledge Management
October 2012
2840 pages
ISBN: 9781450311564
DOI: 10.1145/2396761

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic pseudo relevance feedback
  2. language model
  3. microblog
  4. social media
  5. storyline

Qualifiers

  • Research-article

Conference

CIKM '12

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) A Survey on Event Tracking in Social Media Data Streams. Big Data Mining and Analytics 7:1 (217-243). DOI: 10.26599/BDMA.2023.9020021. Online publication date: Mar-2024
  • (2024) DIEET: Knowledge–Infused Event Tracking in Social Media based on Deep Learning. Peer-to-Peer Networking and Applications 17:4 (2047-2064). DOI: 10.1007/s12083-024-01677-z. Online publication date: 17-Apr-2024
  • (2023) A Survey on Event-Based News Narrative Extraction. ACM Computing Surveys 55:14s (1-39). DOI: 10.1145/3584741. Online publication date: 17-Jul-2023
  • (2023) Automatically Generating Storylines from Microblogging Platforms. Neural Information Processing (38-50). DOI: 10.1007/978-981-99-1648-1_4. Online publication date: 15-Apr-2023
  • (2023) Storyline Generation from News Articles Based on Approximate Personalized Propagation of Neural Predictions. Database Systems for Advanced Applications (37-52). DOI: 10.1007/978-3-031-30678-5_3. Online publication date: 14-Apr-2023
  • (2022) Toward Better Understanding Older Adults: A Biography Brief Timeline Extraction Approach. International Journal of Human–Computer Interaction 39:5 (1084-1095). DOI: 10.1080/10447318.2022.2077278. Online publication date: 2-Jun-2022
  • (2022) Automatic Life Event Tree Generation for Older Adults. HCI International 2022 – Late Breaking Papers: HCI for Health, Well-being, Universal Access and Healthy Aging (366-377). DOI: 10.1007/978-3-031-17902-0_26. Online publication date: 26-Jun-2022
  • (2021) Unsupervised latent event representation learning and storyline extraction from news articles based on neural networks. Intelligent Data Analysis 25:3 (589-603). DOI: 10.3233/IDA-195061. Online publication date: 1-Jan-2021
  • (2021) Assisting News Media Editors with Cohesive Visual Storylines. Proceedings of the 29th ACM International Conference on Multimedia (3257-3265). DOI: 10.1145/3474085.3475476. Online publication date: 17-Oct-2021
  • (2021) Preserve Integrity in Realtime Event Summarization. ACM Transactions on Knowledge Discovery from Data 15:3 (1-29). DOI: 10.1145/3442344. Online publication date: 3-May-2021
