DOI: 10.1145/3477495.3531971
Research article
Open access

Entity-aware Transformers for Entity Search

Published: 07 July 2022

Abstract

Pre-trained language models such as BERT have been a key ingredient for achieving state-of-the-art results on a variety of tasks in natural language processing and, more recently, also in information retrieval. Recent research even claims that BERT captures factual knowledge about entity relations and properties, information that is commonly obtained from knowledge graphs. This paper investigates the following question: do BERT-based entity retrieval models benefit from additional entity information stored in knowledge graphs? To address this question, we map entity embeddings into the same input space as a pre-trained BERT model and inject these entity embeddings into the BERT model. The resulting entity-enriched language model is then employed on the entity retrieval task. We show that the entity-enriched BERT model improves effectiveness on entity-oriented queries over a regular BERT model, establishing a new state-of-the-art result for the entity retrieval task, with substantial improvements for complex natural language queries and for queries requesting a list of entities with a certain property. Additionally, we show that the entity information provided by our entity-enriched model particularly helps queries related to less popular entities. Lastly, we observe empirically that entity-enriched BERT models enable fine-tuning on limited training data, which would otherwise not be feasible due to the known instabilities of BERT in few-sample fine-tuning, thereby contributing to data-efficient training of BERT for entity search.
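The core idea in the abstract, projecting entity embeddings into the same input space as BERT's wordpiece embeddings and feeding them alongside the token sequence, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dimensions (100-d entity vectors in the style of Wikipedia2Vec, 768-d BERT inputs), the random initialisation of the mapping, and all function names are assumptions for the sake of the example.

```python
import numpy as np

# Hypothetical dimensions: Wikipedia2Vec-style entity embeddings (100-d)
# projected into BERT's 768-d input embedding space.
ENTITY_DIM, BERT_DIM = 100, 768

rng = np.random.default_rng(0)

# A linear mapping from entity space to BERT input space. Here it is
# randomly initialised for illustration; in practice it would be learned
# so that projected entity vectors align with the wordpiece embeddings.
W = rng.normal(scale=0.02, size=(ENTITY_DIM, BERT_DIM))

def project_entity(entity_vec: np.ndarray) -> np.ndarray:
    """Map one entity embedding into the BERT input embedding space."""
    return entity_vec @ W

def enrich_input(token_embs: np.ndarray, entity_vecs: list) -> np.ndarray:
    """Append projected entity embeddings to the token embedding
    sequence, yielding the entity-enriched input for the transformer."""
    projected = [project_entity(e) for e in entity_vecs]
    return np.vstack([token_embs] + projected)

# Toy example: a 5-token query annotated with one linked entity.
tokens = rng.normal(size=(5, BERT_DIM))
entity = rng.normal(size=ENTITY_DIM)
enriched = enrich_input(tokens, [entity])
print(enriched.shape)  # (6, 768)
```

In this sketch the enriched sequence simply grows by one row per linked entity; the transformer then attends over token and entity representations jointly, which is how the knowledge-graph signal reaches the ranking model.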


    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
    ISBN:9781450387323
    DOI:10.1145/3477495
    This work is licensed under a Creative Commons Attribution International 4.0 License.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. bert
    2. entity embeddings
    3. entity retrieval
    4. transformers

    Qualifiers

    • Research-article

    Conference

    SIGIR '22

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 300
    • Downloads (last 6 weeks): 35
    Reflects downloads up to 11 Dec 2024


    Cited By

    • (2024) Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge. Proc. of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 12-22. DOI: 10.1145/3673791.3698415. Online publication date: 8-Dec-2024.
    • (2024) Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph. Proc. of the ACM Web Conference 2024, 1519-1528. DOI: 10.1145/3589334.3645676. Online publication date: 13-May-2024.
    • (2024) Learning contextual representations for entity retrieval. Applied Intelligence 54, 19, 8820-8840. DOI: 10.1007/s10489-024-05430-0. Online publication date: 4-Jul-2024.
    • (2024) DREQ: Document Re-ranking Using Entity-Based Query Understanding. Advances in Information Retrieval, 210-229. DOI: 10.1007/978-3-031-56027-9_13. Online publication date: 24-Mar-2024.
    • (2023) A Purely Entity-Based Semantic Search Approach for Document Retrieval. Applied Sciences 13, 18, 10285. DOI: 10.3390/app131810285. Online publication date: 14-Sep-2023.
    • (2023) Neuro-Symbolic Representations for Information Retrieval. Proc. of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3436-3439. DOI: 10.1145/3539618.3594246. Online publication date: 19-Jul-2023.
    • (2023) MMEAD: MS MARCO Entity Annotations and Disambiguations. Proc. of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2817-2825. DOI: 10.1145/3539618.3591887. Online publication date: 19-Jul-2023.
    • (2023) Who can verify this? Finding authorities for rumor verification in Twitter. Information Processing and Management 60, 4. DOI: 10.1016/j.ipm.2023.103366. Online publication date: 1-Jul-2023.
    • (2022) Dense Retrieval with Entity Views. Proc. of the 31st ACM International Conference on Information & Knowledge Management, 1955-1964. DOI: 10.1145/3511808.3557285. Online publication date: 17-Oct-2022.
