DOI: 10.1145/3331184.3331317
short-paper

CEDR: Contextualized Embeddings for Document Ranking

Published: 18 July 2019

Abstract

Although considerable attention has been given to neural ranking architectures recently, far less attention has been paid to the term representations that are used as input to these models. In this work, we investigate how two pretrained contextualized language models (ELMo and BERT) can be utilized for ad-hoc document ranking. Through experiments on TREC benchmarks, we find that several existing neural ranking architectures can benefit from the additional context provided by contextualized language models. Furthermore, we propose a joint approach that incorporates BERT's classification vector into existing neural models and show that it outperforms state-of-the-art ad-hoc ranking baselines. We call this joint approach CEDR (Contextualized Embeddings for Document Ranking). We also address practical challenges in using these models for ranking, including the maximum input length imposed by BERT and runtime performance impacts of contextualized language models.
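The input-length challenge mentioned above can be illustrated with a minimal sketch. It shows one common workaround, not necessarily the authors' exact procedure: split a long document into near-equal chunks so that each query-plus-chunk pair fits the model's limit, then average the per-chunk classification vectors. The function names (`split_into_chunks`, `mean_vector`), the 512-token limit, and the allowance of three special tokens are assumptions made for illustration; the vectors here are plain Python lists standing in for real model outputs.

```python
from typing import List


def split_into_chunks(doc_tokens: List[str], query_len: int,
                      max_len: int = 512, n_special: int = 3) -> List[List[str]]:
    """Split document tokens into near-equal chunks.

    Each chunk is sized so that query + chunk + special tokens fits in
    max_len. max_len=512 and n_special=3 are illustrative assumptions.
    """
    budget = max_len - query_len - n_special  # room left for document tokens
    if budget <= 0:
        raise ValueError("query is too long for the input limit")
    if not doc_tokens:
        return [[]]
    n_chunks = -(-len(doc_tokens) // budget)  # ceiling division
    size = -(-len(doc_tokens) // n_chunks)    # near-equal chunk size
    return [doc_tokens[i:i + size] for i in range(0, len(doc_tokens), size)]


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors (e.g. one per chunk)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


if __name__ == "__main__":
    doc = [f"tok{i}" for i in range(1200)]
    chunks = split_into_chunks(doc, query_len=9)
    print([len(c) for c in chunks])
    print(mean_vector([[1.0, 2.0], [3.0, 4.0]]))
```

Using near-equal chunks rather than a fixed stride avoids a very short final chunk, at the cost of chunk boundaries that depend on document length.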



Information & Contributors

Information

Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2019
1512 pages
ISBN: 9781450361729
DOI: 10.1145/3331184
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. contextualized word embeddings
  2. neural ranking

Qualifiers

  • Short-paper

Conference

SIGIR '19

Acceptance Rates

SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;
Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 121
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 27 Jan 2025

Citations

Cited By

  • (2025) ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search. IEEE Access 13, 15253-15271. DOI: 10.1109/ACCESS.2025.3529741. Online publication date: 2025.
  • (2025) Refining the Giants: A Comprehensive Review of Fine-Tuning Strategies for Large Language Models. Computing and Machine Learning, 65-82. DOI: 10.1007/978-981-97-7839-3_5. Online publication date: 24-Jan-2025.
  • (2024) Direct Backpropagation Realization for A Neural Network including a Database. Proceedings of the 2024 9th International Conference on Intelligent Information Technology, 340-345. DOI: 10.1145/3654522.3654555. Online publication date: 23-Feb-2024.
  • (2024) Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges. ACM Computing Surveys 56:7, 1-33. DOI: 10.1145/3648471. Online publication date: 9-Apr-2024.
  • (2024) Predicting Representations of Information Needs from Digital Activity Context. ACM Transactions on Information Systems 42:4, 1-29. DOI: 10.1145/3639819. Online publication date: 15-Jan-2024.
  • (2024) Towards Effective and Efficient Sparse Neural Information Retrieval. ACM Transactions on Information Systems 42:5, 1-46. DOI: 10.1145/3634912. Online publication date: 29-Apr-2024.
  • (2024) Efficient Neural Ranking Using Forward Indexes and Lightweight Encoders. ACM Transactions on Information Systems 42:5, 1-34. DOI: 10.1145/3631939. Online publication date: 29-Apr-2024.
  • (2024) Clinical Trial Retrieval via Multi-grained Similarity Learning. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2950-2954. DOI: 10.1145/3626772.3661366. Online publication date: 10-Jul-2024.
  • (2024) Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2321-2326. DOI: 10.1145/3626772.3657979. Online publication date: 10-Jul-2024.
  • (2024) A Reproducibility Study of PLAID. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1411-1419. DOI: 10.1145/3626772.3657856. Online publication date: 10-Jul-2024.
