research-article

Microblog Entity Linking with Social Temporal Context

Authors:

Xiaofang Zhou

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data

Pages 1761 - 1775

https://doi.org/10.1145/2723372.2751522

Published: 27 May 2015

Abstract

Microblogging sites such as Twitter and the Chinese site Sina Weibo have established themselves as an invaluable information source, providing a huge collection of manually generated tweets on a broad range of topics, from daily life to breaking news. Entity linking is indispensable for understanding and maintaining such information, which in turn facilitates many real-world applications such as tweet clustering and classification and personalized microblog search. However, tweets are short, informal, and error-prone, rendering traditional approaches for entity linking in documents largely inapplicable. Recent work addresses this problem by utilizing information from other tweets and linking entities in a batch manner. Nevertheless, the high computational complexity of this approach makes it infeasible for real-time applications, given the high arrival rate of tweets. In this paper, we propose an efficient solution to link entities in tweets by analyzing their social and temporal context. Our proposed framework takes three features into consideration to assist the entity linking task: entity popularity, entity recency, and user interest information embedded in social interactions. Effective indexing structures along with incremental algorithms have also been developed to reduce the computation and maintenance costs of our approach. Experimental results based on real tweet datasets verify the effectiveness and efficiency of our proposals.
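To make the three features named in the abstract concrete, the following is a minimal sketch of how a candidate entity might be scored. It is not the authors' method: the abstract does not say how the features are computed or combined, so the weighted sum, the exponential-decay recency, the weights, the half-life, and every name here (`Candidate`, `recency`, `score`, `link`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate entity for a mention (all fields are hypothetical)."""
    entity: str
    popularity: float      # assumed prior in [0, 1]
    last_linked_at: float  # assumed time (seconds) the entity was last linked
    interest: float        # assumed user-interest score in [0, 1]


def recency(last_linked_at: float, now: float, half_life: float = 3600.0) -> float:
    """Exponential decay: 1.0 if linked just now, 0.5 after one half-life."""
    age = max(0.0, now - last_linked_at)
    return 0.5 ** (age / half_life)


def score(c: Candidate, now: float, weights=(0.4, 0.3, 0.3)) -> float:
    """Weighted sum of popularity, recency, and user interest (assumed form)."""
    w_pop, w_rec, w_int = weights
    return (w_pop * c.popularity
            + w_rec * recency(c.last_linked_at, now)
            + w_int * c.interest)


def link(candidates: list[Candidate], now: float) -> str:
    """Return the entity name of the highest-scoring candidate."""
    return max(candidates, key=lambda c: score(c, now)).entity


# Usage: two made-up candidates for the mention "apple".
now = 10_000.0
candidates = [
    Candidate("Apple Inc.", popularity=0.7, last_linked_at=2_800.0, interest=0.2),
    Candidate("Apple (fruit)", popularity=0.3, last_linked_at=10_000.0, interest=0.9),
]
best = link(candidates, now)
print(best)
```

In this toy setup the less popular candidate wins because it was linked more recently and matches the user's interests better, which is the kind of effect social and temporal context is meant to capture.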

Index Terms

Microblog Entity Linking with Social Temporal Context
1. Information systems
  1. Information retrieval

Information & Contributors

Information

Published In

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data

May 2015

2110 pages

ISBN:9781450327589

DOI:10.1145/2723372

General Chair: Timos Sellis (RMIT University, Australia)

Program Chairs: Susan B. Davidson (University of Pennsylvania, USA), Zack Ives (University of Pennsylvania, USA)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 May 2015

Qualifiers

Research-article

Funding Sources

ARC Project

Conference

SIGMOD/PODS'15

Sponsor:

SIGMOD

SIGMOD/PODS'15: International Conference on Management of Data

May 31 - June 4, 2015

Melbourne, Victoria, Australia

Acceptance Rates

SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 24
Total Downloads: 1,018
Downloads (Last 12 months): 13
Downloads (Last 6 weeks): 3

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Aljebreen A, Meng W, Dragut E, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Analysis and Detection of "Pink Slime" Websites in Social Media Posts. Proceedings of the ACM Web Conference 2024 (2572-2581). Online publication date: 13-May-2024
https://dl.acm.org/doi/10.1145/3589334.3645588

Mei X, Yang L, Jiang Z, Cai X, Gao D, Han J, Pan S (2024). An Inductive Reasoning Model based on Interpretable Logical Rules over temporal knowledge graph. Neural Networks 174 (106219). Online publication date: Jun-2024
https://doi.org/10.1016/j.neunet.2024.106219

Shen W, Yin Y, Yang Y, Han J, Wang J, Yuan X (2022). Toward Tweet Entity Linking With Heterogeneous Information Networks. IEEE Transactions on Knowledge and Data Engineering 34:12 (6003-6017). Online publication date: 1-Dec-2022
https://doi.org/10.1109/TKDE.2021.3068093

Meng C, Cui Y, He Q, Su L, Gao J (2022). Towards the Inference of Travel Purpose with Heterogeneous Urban Data. IEEE Transactions on Big Data 8:1 (166-177). Online publication date: 1-Feb-2022
https://doi.org/10.1109/TBDATA.2019.2921823

Alsaudi A, Altowim Y, Mehrotra S, Yu Y (2021). TQEL. Proceedings of the VLDB Endowment 14:11 (2642-2654). Online publication date: 27-Oct-2021
https://dl.acm.org/doi/10.14778/3476249.3476309

Vretinaris A, Lei C, Efthymiou V, Qin X, Özcan F, Li G, Li Z, Idreos S, Srivastava D (2021). Medical Entity Disambiguation Using Graph Neural Networks. Proceedings of the 2021 International Conference on Management of Data (2310-2318). Online publication date: 9-Jun-2021
https://dl.acm.org/doi/10.1145/3448016.3457328

Liu Y, Shen W, Wang Y, Wang J, Yang Z, Yuan X, Li G, Li Z, Idreos S, Srivastava D (2021). Joint Open Knowledge Base Canonicalization and Linking. Proceedings of the 2021 International Conference on Management of Data (2253-2261). Online publication date: 9-Jun-2021
https://dl.acm.org/doi/10.1145/3448016.3452776

Shen W, Li Y, Liu Y, Han J, Wang J, Yuan X (2021). Entity Linking Meets Deep Learning: Techniques and Solutions. IEEE Transactions on Knowledge and Data Engineering (1-1). Online publication date: 2021
https://doi.org/10.1109/TKDE.2021.3117715

Saha Bhowmick S, Dragut E, Meng W (2021). TwiCS: Lightweight Entity Mention Detection in Targeted Twitter Streams. IEEE Transactions on Knowledge and Data Engineering (1-1). Online publication date: 2021
https://doi.org/10.1109/TKDE.2021.3088716

Hosseini H, Bagheri E (2021). Learning to rank implicit entities on Twitter. Information Processing & Management 58:3 (102503). Online publication date: May-2021
https://doi.org/10.1016/j.ipm.2021.102503
