PARADE: Passage Representation Aggregation for Document Reranking

Published: 27 September 2023

Abstract

Pre-trained transformer models such as BERT and T5 have proven highly effective at ad hoc passage and document ranking. Due to their inherent sequence length limits, these models must process a document’s passages one at a time rather than processing the entire document sequence at once. Although several approaches for aggregating passage-level signals into a document-level relevance score have been proposed, there has yet to be an extensive comparison of these techniques. In this work, we explore strategies for aggregating relevance signals from a document’s passages into a final ranking score. We find that passage representation aggregation techniques can significantly improve over the score aggregation techniques proposed in prior work, such as taking the maximum passage score. We call this new approach PARADE. In particular, PARADE can significantly improve results on collections with broad information needs, where relevance signals can be spread throughout the document (such as TREC Robust04 and GOV2). Meanwhile, less complex aggregation techniques may work better on collections where the information need can often be pinpointed to a single passage (such as TREC DL and TREC Genomics). We also conduct efficiency analyses and highlight several strategies for improving transformer-based aggregation.
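
As a concrete illustration of the distinction drawn above, the hypothetical PyTorch sketch below scores one document from its passages in both ways: score aggregation reduces per-passage relevance scores with a max, while representation aggregation lets a small transformer attend across passage representations before producing a single document score. This is a minimal sketch, not the authors’ implementation; the names, the mean-pooling step, and the random stand-in inputs are assumptions made for the example.

import torch
import torch.nn as nn

def max_score_aggregation(passage_scores: torch.Tensor) -> torch.Tensor:
    # Score aggregation: the document score is its single best passage score.
    # passage_scores: (num_passages,)
    return passage_scores.max()

class RepresentationAggregator(nn.Module):
    # Representation aggregation (simplified): a small transformer attends
    # across the passages' [CLS]-style vectors, and a linear layer scores
    # the mean-pooled output. Mean pooling is an assumption of this sketch;
    # the paper's actual PARADE variants differ in detail.
    def __init__(self, hidden: int = 768, heads: int = 8, layers: int = 2):
        super().__init__()
        block = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=layers)
        self.score = nn.Linear(hidden, 1)

    def forward(self, passage_reps: torch.Tensor) -> torch.Tensor:
        # passage_reps: (num_passages, hidden), e.g., per-passage BERT vectors
        out = self.encoder(passage_reps.unsqueeze(0))  # (1, P, hidden)
        return self.score(out.mean(dim=1)).squeeze()   # scalar document score

# Hypothetical stand-ins for one document split into 8 passages.
scores = torch.randn(8)       # per-passage relevance scores
reps = torch.randn(8, 768)    # per-passage representations
print(max_score_aggregation(scores))
print(RepresentationAggregator().eval()(reps))

Attending across passages in this way is one plausible mechanism for capturing relevance signals that are spread throughout a document, which the abstract identifies as the setting where representation aggregation helps most.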

      Published In

ACM Transactions on Information Systems, Volume 42, Issue 2
March 2024
897 pages
EISSN: 1558-2868
DOI: 10.1145/3618075

      Publisher

Association for Computing Machinery
New York, NY, United States

      Publication History

      Published: 27 September 2023
      Online AM: 26 May 2023
      Accepted: 10 May 2023
      Revised: 16 March 2023
      Received: 15 December 2021
      Published in TOIS Volume 42, Issue 2

      Author Tags

      1. Document reranking
      2. passage representation aggregation
      3. pre-trained language models

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Google Cloud
      • Google TPU Research Cloud (TRC)
