Research article, DOI: 10.1145/2723372.2751522

Microblog Entity Linking with Social Temporal Context

Published: 27 May 2015

Abstract

Microblogging sites such as Twitter and the Chinese Sina Weibo have established themselves as an invaluable information source, providing a huge collection of manually generated tweets on a broad range of topics, from daily life to breaking news. Entity linking is indispensable for understanding and maintaining such information, which in turn facilitates many real-world applications such as tweet clustering and classification and personalized microblog search. However, tweets are short, informal, and error-prone, rendering traditional approaches to entity linking in documents largely inapplicable. Recent work addresses this problem by utilizing information from other tweets and linking entities in a batch manner. Nevertheless, the high computational complexity of this approach makes it infeasible for real-time applications, given the high arrival rate of tweets. In this paper, we propose an efficient solution that links entities in tweets by analyzing their social and temporal context. Our proposed framework considers three features to assist the entity linking task: entity popularity, entity recency, and user interest information embedded in social interactions. We have also developed effective indexing structures and incremental algorithms to reduce the computation and maintenance costs of our approach. Experimental results on real tweet datasets verify the effectiveness and efficiency of our proposals.
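
To make the three features named in the abstract concrete, the following is a minimal Python sketch of how popularity, recency, and user interest signals could be combined to rank candidate entities for one mention. It is an illustration only, not the paper's method: the abstract does not give the scoring formulas, so the `Candidate` fields, the exponential half-life decay, the weighted sum, and the default weights are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A candidate entity for one mention (all fields are hypothetical)."""
    name: str
    prior_count: int                    # how often the mention was linked to this entity overall
    recent_link_times: list = field(default_factory=list)  # timestamps (seconds) of recent links
    user_interest: float = 0.0          # assumed score in [0, 1] derived from social interactions


def popularity(cand, candidates):
    """Share of all prior links for this mention that went to `cand`."""
    total = sum(c.prior_count for c in candidates)
    return cand.prior_count / total if total else 0.0


def recency(cand, now, half_life=3600.0):
    """Decayed count of recent links: a link made `half_life` seconds ago counts as 0.5."""
    return sum(0.5 ** ((now - t) / half_life)
               for t in cand.recent_link_times if t <= now)


def score_candidates(candidates, now, weights=(0.4, 0.3, 0.3)):
    """Return {entity name: score} using an assumed weighted sum of the three features."""
    w_pop, w_rec, w_int = weights
    raw_rec = [recency(c, now) for c in candidates]
    rec_total = sum(raw_rec)
    scores = {}
    for c, r in zip(candidates, raw_rec):
        rec = r / rec_total if rec_total else 0.0   # normalize recency across candidates
        scores[c.name] = (w_pop * popularity(c, candidates)
                          + w_rec * rec
                          + w_int * c.user_interest)
    return scores


def link(candidates, now, weights=(0.4, 0.3, 0.3)):
    """Pick the highest-scoring candidate entity name."""
    scores = score_candidates(candidates, now, weights)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    now = 10_000.0
    candidates = [
        Candidate("Apple Inc.", prior_count=90, user_interest=0.1),
        Candidate("Apple (fruit)", prior_count=10,
                  recent_link_times=[now - 60, now - 120], user_interest=0.8),
    ]
    print(link(candidates, now))
```

In this toy input, the globally popular candidate loses to the one that is both recently linked and closer to the user's interests, which is the kind of case social and temporal context is meant to resolve. Passing `weights=(1.0, 0.0, 0.0)` reduces the sketch to a popularity-only baseline.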




    Published In

    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN:9781450327589
    DOI:10.1145/2723372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. entity popularity
    2. entity recency
    3. microblog entity linking
    4. social temporal context
    5. user interest

    Qualifiers

    • Research-article

    Funding Sources

    • ARC Project

    Conference

    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Melbourne, Victoria, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%


    Cited By

    • (2024) Analysis and Detection of "Pink Slime" Websites in Social Media Posts. Proceedings of the ACM Web Conference 2024, pages 2572-2581. DOI: 10.1145/3589334.3645588. Online publication date: 13-May-2024
    • (2024) An Inductive Reasoning Model based on Interpretable Logical Rules over temporal knowledge graph. Neural Networks, 174 (106219). DOI: 10.1016/j.neunet.2024.106219. Online publication date: Jun-2024
    • (2022) Toward Tweet Entity Linking With Heterogeneous Information Networks. IEEE Transactions on Knowledge and Data Engineering, 34:12 (6003-6017). DOI: 10.1109/TKDE.2021.3068093. Online publication date: 1-Dec-2022
    • (2022) Towards the Inference of Travel Purpose with Heterogeneous Urban Data. IEEE Transactions on Big Data, 8:1 (166-177). DOI: 10.1109/TBDATA.2019.2921823. Online publication date: 1-Feb-2022
    • (2021) TQEL. Proceedings of the VLDB Endowment, 14:11 (2642-2654). DOI: 10.14778/3476249.3476309. Online publication date: 27-Oct-2021
    • (2021) Medical Entity Disambiguation Using Graph Neural Networks. Proceedings of the 2021 International Conference on Management of Data, pages 2310-2318. DOI: 10.1145/3448016.3457328. Online publication date: 9-Jun-2021
    • (2021) Joint Open Knowledge Base Canonicalization and Linking. Proceedings of the 2021 International Conference on Management of Data, pages 2253-2261. DOI: 10.1145/3448016.3452776. Online publication date: 9-Jun-2021
    • (2021) Entity Linking Meets Deep Learning: Techniques and Solutions. IEEE Transactions on Knowledge and Data Engineering (1-1). DOI: 10.1109/TKDE.2021.3117715. Online publication date: 2021
    • (2021) TwiCS: Lightweight Entity Mention Detection in Targeted Twitter Streams. IEEE Transactions on Knowledge and Data Engineering (1-1). DOI: 10.1109/TKDE.2021.3088716. Online publication date: 2021
    • (2021) Learning to rank implicit entities on Twitter. Information Processing & Management, 58:3 (102503). DOI: 10.1016/j.ipm.2021.102503. Online publication date: May-2021
