DOI: 10.1145/3437963.3441667
Tutorial
Open access

Pretrained Transformers for Text Ranking: BERT and Beyond

Published: 08 March 2021

Abstract

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications. This tutorial, based on a forthcoming book, provides an overview of text ranking with neural network architectures known as transformers, of which BERT is the best-known example. The combination of transformers and self-supervised pretraining has, without exaggeration, revolutionized the fields of natural language processing (NLP), information retrieval (IR), and beyond. We provide a synthesis of existing work as a single point of entry for both researchers and practitioners. Our coverage is grouped into two categories: transformer models that perform reranking in multi-stage ranking architectures and learned dense representations that perform ranking directly. Two themes pervade our treatment: techniques for handling long documents and techniques for addressing the tradeoff between effectiveness (result quality) and efficiency (query latency). Although transformer architectures and pretraining techniques are recent innovations, many aspects of their application are well understood. Nevertheless, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, we also attempt to prognosticate the future.
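
To make the tutorial's two categories concrete, the sketch below shows both patterns in Python using the Hugging Face transformers library: (1) a cross-encoder that reranks candidates from a first-stage retriever such as BM25 by scoring each query-passage pair jointly, and (2) a bi-encoder that encodes queries and passages independently into dense vectors so that ranking reduces to an inner-product (nearest-neighbor) search. This is a minimal illustration, not the authors' implementation; the checkpoint names, the toy candidate list, and the mean-pooling choice are assumptions made here for the example.

# Illustrative sketch only: checkpoint names, the toy candidate list, and the
# pooling strategy are assumptions, not prescriptions from the tutorial.
import torch
from transformers import (AutoModel, AutoModelForSequenceClassification,
                          AutoTokenizer)

query = "what is text ranking"
candidates = [  # in practice, the top-k results of a first-stage retriever such as BM25
    "Text ranking produces an ordered list of texts in response to a query.",
    "BERT is a pretrained bidirectional transformer encoder.",
]

# (1) Reranking in a multi-stage architecture: a cross-encoder reads the
# concatenated (query, passage) pair and emits a relevance score.
ce_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # any pointwise reranker checkpoint
ce_tok = AutoTokenizer.from_pretrained(ce_name)
ce_model = AutoModelForSequenceClassification.from_pretrained(ce_name).eval()
pairs = ce_tok([query] * len(candidates), candidates,
               padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    rerank_scores = ce_model(**pairs).logits.squeeze(-1)

# (2) Ranking with learned dense representations: a bi-encoder maps queries and
# passages to vectors independently; ranking is a (possibly approximate)
# nearest-neighbor search over precomputed passage vectors.
be_name = "sentence-transformers/msmarco-distilbert-base-v4"
be_tok = AutoTokenizer.from_pretrained(be_name)
be_model = AutoModel.from_pretrained(be_name).eval()

def encode(texts):
    batch = be_tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = be_model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean pooling over tokens

dense_scores = encode([query]) @ encode(candidates).T  # shape: (1, num_candidates)

print(rerank_scores.tolist(), dense_scores.squeeze(0).tolist())

In a deployed system the cross-encoder typically rescores only the top few hundred candidates per query, while the bi-encoder's passage vectors are computed offline and indexed for approximate nearest-neighbor search; this division of labor is one way the effectiveness-efficiency tradeoff discussed above plays out in practice.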

    Published In

    WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
    March 2021
    1192 pages
    ISBN: 9781450382977
    DOI: 10.1145/3437963
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 March 2021

    Author Tags

    1. text ranking
    2. transformers

    Qualifiers

    • Tutorial

    Funding Sources

    • Natural Sciences and Engineering Research Council of Canada
    • Canada First Research Excellence Fund

    Conference

    WSDM '21

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Article Metrics

    • Downloads (Last 12 months): 683
    • Downloads (Last 6 weeks): 109
    Reflects downloads up to 15 Jan 2025

    Cited By

    • (2024) Extracting Political Interest Model from Interaction Data Based on Novel Word-level Bias Assignment. ACM Transactions on Intelligent Systems and Technology 16(1), 1-21. DOI: 10.1145/3702649. Online publication date: 31-Oct-2024.
    • (2024) Neural Retrievers are Biased Towards LLM-Generated Content. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 526-537. DOI: 10.1145/3637528.3671882. Online publication date: 25-Aug-2024.
    • (2024) Enhancing Asymmetric Web Search through Question-Answer Generation and Ranking. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6127-6136. DOI: 10.1145/3637528.3671517. Online publication date: 25-Aug-2024.
    • (2024) Semantic Ranking for Automated Adversarial Technique Annotation in Security Text. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, 49-62. DOI: 10.1145/3634737.3645000. Online publication date: 1-Jul-2024.
    • (2024) Custom Architecture for Effective Semantic App Search: A Systematic Approach. 2024 8th International Conference on Computational System and Information Technology for Sustainable Solutions (CSITSS), 1-6. DOI: 10.1109/CSITSS64042.2024.10816842. Online publication date: 7-Nov-2024.
    • (2024) Identifying Learning Leaders in Online Social Networks Based on Community of Practice Theoretical Framework and Information Entropy. IEEE Access 12, 116622-116636. DOI: 10.1109/ACCESS.2024.3446454. Online publication date: 2024.
    • (2024) Enhancing Parameter Efficiency in Model Inference Using an Ultralight Inter-Transformer Linear Structure. IEEE Access 12, 43734-43746. DOI: 10.1109/ACCESS.2024.3378518. Online publication date: 2024.
    • (2024) Efficient Classification of Malicious URLs: M-BERT—A Modified BERT Variant for Enhanced Semantic Understanding. IEEE Access 12, 13453-13468. DOI: 10.1109/ACCESS.2024.3357095. Online publication date: 2024.
    • (2024) Mono-lingual text reuse detection for the Urdu language at lexical level. Engineering Applications of Artificial Intelligence 136, 109003. DOI: 10.1016/j.engappai.2024.109003. Online publication date: Oct-2024.
    • (2024) Integrating Social Environment in Machine Learning Model for Debiased Recommendation. Mobile and Ubiquitous Systems: Computing, Networking and Services, 219-230. DOI: 10.1007/978-3-031-63992-0_14. Online publication date: 19-Jul-2024.
    • Show More Cited By
