DOI: 10.1145/1571941.1572157
demonstration

Sifting micro-blogging stream for events of user interest

Published: 19 July 2009

Abstract

Micro-blogging is a new form of social communication that encourages users to share information about anything they are seeing or doing, a motivation facilitated by the ability to post brief text messages from a variety of devices. Twitter, the most popular micro-blogging tool, is exhibiting rapid growth [3]: the share of online Americans using Twitter reached 11% by December 2008, compared to 6% in May 2008. Due to its nature, the micro-blogosphere has unique features: (i) it is a source of extremely up-to-date information about what is happening in the world; (ii) it captures the wisdom of millions of people and covers a broad range of domains. These features make the micro-blogosphere more than a popular medium of social communication: we believe that it has additionally become a valuable source of extremely up-to-date news on virtually any subject of user interest.
Using the micro-blogosphere in this new role raises the following challenges: (A) Since any given subject is generally mentioned in the micro-blogging stream on a continuous basis, a method is needed for locating the periods in which there is news on this subject. (B) Even within such periods, the stream must be filtered to remove noise and to extract the messages that best describe the news. To address these challenges we make and exploit the following observations: (A) For an arbitrary subject, events that catch user interest gain distinguishably more attention than the average mentioning of the subject, resulting in bursts of message activity for it. (B) Most of the messages in an activity burst describe a common event in close variations, either rephrased or "retweeted" between users.
We demonstrate TweetSieve, a system that allows obtaining news on any given subject by sifting the Twitter stream. Our work is related to frequency-based analysis applied to blogs [1], but the higher latency and lower coverage of blogs make that analysis less effective than in the case of micro-blogs.
In the TweetSieve demo, the user expresses the subject of her interest as an arbitrary search string. The system shows the periods in which events occur for the subject and outputs the tweets that best describe each of the events. Figure 1 shows a screenshot of the system for "Semantic search" as a sample subject. The underlying process consists of two steps:
1. Identifying activity bursts. The frequency curve is constructed by counting the messages in the stream that match the search string over time. Activity bursts in the curve are identified as the periods in which the frequency exceeds the average by more than the standard deviation.
2. Selecting messages that best describe news events. To the set of all messages matching the search string in an activity burst, we apply the message-granular variation of our keyphrase extraction algorithm [2], which is specifically suited to efficiently filtering noisy data. The algorithm clusters messages with respect to their similarity to each other and chooses central messages from the densest clusters. As the similarity measure we use the Jaccard coefficient on the "bag of words" representation of messages.
The demonstration illustrates the potential of our approach in bringing news acquisition to a new level of promptness and coverage range.
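
The two steps above can be illustrated with a minimal Python sketch. It is not the TweetSieve implementation, and several choices are assumptions made only for illustration: the burst threshold is read as "mean plus k standard deviations" with k = 1; the Jaccard coefficient is computed on sets of lowercased, whitespace-split words; and the clustering algorithm of [2] is replaced by a much simpler stand-in (connected components of a thresholded similarity graph, with cluster size used as a rough proxy for density). The names find_bursts, jaccard, representative_messages, k, sim_threshold and top_clusters are all hypothetical.

```python
from statistics import mean, pstdev


def find_bursts(counts, k=1.0):
    """Return (start, end) index pairs of maximal runs of time buckets
    whose message count exceeds mean + k * standard deviation.
    Assumption: k = 1 is one reading of the abstract's burst criterion."""
    if not counts:
        return []
    threshold = mean(counts) + k * pstdev(counts)
    bursts, start = [], None
    for i, c in enumerate(counts):
        if c > threshold:
            if start is None:
                start = i  # a new burst begins here
        elif start is not None:
            bursts.append((start, i - 1))  # the burst ended at the previous bucket
            start = None
    if start is not None:
        bursts.append((start, len(counts) - 1))
    return bursts


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two word collections,
    treated as sets (an assumption about the 'bag of words' form)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def representative_messages(messages, sim_threshold=0.5, top_clusters=1):
    """Pick one central message from each of the largest clusters.
    Stand-in for the clustering of [2]: clusters are the connected
    components of the graph linking messages with similarity >= sim_threshold."""
    tokens = [set(m.lower().split()) for m in messages]
    n = len(messages)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(tokens[i], tokens[j]) >= sim_threshold:
                adj[i].add(j)
                adj[j].add(i)

    # Collect connected components with an iterative depth-first search.
    seen, clusters = set(), []
    for i in range(n):
        if i in seen:
            continue
        seen.add(i)
        stack, component = [i], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        clusters.append(component)

    # Largest clusters first; size is only a proxy for density here.
    clusters.sort(key=len, reverse=True)
    chosen = []
    for component in clusters[:top_clusters]:
        # Central message: highest total similarity to the rest of its cluster.
        best = max(
            component,
            key=lambda u: sum(
                jaccard(tokens[u], tokens[v]) for v in component if v != u
            ),
        )
        chosen.append(messages[best])
    return chosen
```

In use, one would bucket the matching tweets by time (for example per hour), pass the per-bucket counts to find_bursts, and then call representative_messages on the tweets that fall inside each returned burst. The pairwise similarity computation is quadratic in the number of messages, which is acceptable for a sketch but would need a more efficient scheme on a real stream.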

References

[1] N. Bansal and N. Koudas. Blogscope: A system for online analysis of high volume text streams. In VLDB, pages 1410-1413. ACM, 2007.
[2] M. Grineva, M. Grinev, and D. Lizorkin. Extracting key terms from noisy and multitheme documents. Accepted to the World Wide Web conference (WWW'09).
[3] A. Lenhart and S. Fox. Twitter and status updating. Pew Internet & American Life Project, Feb 2009.


      Published In
      SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
      July 2009
      896 pages
      ISBN:9781605584836
      DOI:10.1145/1571941

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. micro-blogging
      2. news
      3. twitter

      Qualifiers

      • Demonstration

      Conference

      SIGIR '09
      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Cited By

      • (2020) Automatic 3D Modeling and Reconstruction of Cultural Heritage Sites from Twitter Images. Sustainability 12:10 (4223). DOI: 10.3390/su12104223. Online publication date: 21-May-2020
      • (2020) A framework for understanding online group behaviors during a catastrophic event. International Journal of Information Management: The Journal for Information Professionals 51:C. DOI: 10.1016/j.ijinfomgt.2019.102051. Online publication date: 1-Apr-2020
      • (2019) A Document Ranking Method With Query-Related Web Context. IEEE Access 7 (150168-150174). DOI: 10.1109/ACCESS.2019.2947166. Online publication date: 2019
      • (2017) Machine Learning and Semantic Sentiment Analysis based Algorithms for Suicide Sentiment Prediction in Social Networks. Procedia Computer Science 113 (65-72). DOI: 10.1016/j.procs.2017.08.290. Online publication date: 2017
      • (2016) Event Detection in Twitter Microblogging. IEEE Transactions on Cybernetics 46:12 (2810-2824). DOI: 10.1109/TCYB.2015.2489841. Online publication date: Dec-2016
      • (2015) TwitterTrends. Multimedia Systems 21:1 (73-86). DOI: 10.1007/s00530-013-0342-0. Online publication date: 1-Feb-2015
      • (2014) Topic Extraction Based on Knowledge Cluster in the Field of Micro-blog. Intelligent Computing Methodologies (542-550). DOI: 10.1007/978-3-319-09339-0_55. Online publication date: 2014
      • (2013) Topical Discussions on Unstructured Microblogs: Analysis from a Geographical Perspective. Web Information Systems Engineering – WISE 2013 (160-173). DOI: 10.1007/978-3-642-41154-0_12. Online publication date: 2013
      • (2013) Mining user interest and its evolution for recommendation on the micro-blogging system. Proceedings of the 14th international conference on Web-Age Information Management (679-690). DOI: 10.1007/978-3-642-38562-9_69. Online publication date: 14-Jun-2013
      • (2013) Discovery and analysis of evolving topical social discussions on unstructured microblogs. Proceedings of the 35th European conference on Advances in Information Retrieval (545-556). DOI: 10.1007/978-3-642-36973-5_46. Online publication date: 24-Mar-2013
