DOI: 10.1145/3539618.3591631
research-article
Open access

A Unified Generative Retriever for Knowledge-Intensive Language Tasks via Prompt Learning

Published: 18 July 2023

Abstract

Knowledge-intensive language tasks (KILTs) benefit from retrieving high-quality relevant contexts from large external knowledge corpora. Learning task-specific retrievers that return relevant contexts at an appropriate level of semantic granularity, such as document, passage, sentence, and entity retrievers, may help to achieve better performance on the end-to-end task. However, task-specific retrievers usually generalize poorly to new domains and tasks, and deploying a variety of specialized retrievers can be costly in practice.
We propose a unified generative retriever (UGR) that combines task-specific effectiveness with robust performance across different retrieval tasks in KILTs. To achieve this goal, we make two major contributions: (i) To unify different retrieval tasks into a single generative form, we introduce an n-gram-based identifier for relevant contexts at different levels of granularity in KILTs. (ii) To address different retrieval tasks with a single model, we employ a prompt learning strategy and investigate three methods of designing prompt tokens for each task. In this way, the proposed UGR model can not only share common knowledge across tasks for better generalization, but also perform different retrieval tasks effectively by distinguishing task-specific characteristics. We train UGR on a heterogeneous set of retrieval corpora with well-designed prompts in a supervised and multi-task fashion. Experimental results on the KILT benchmark demonstrate the effectiveness of UGR on in-domain datasets, out-of-domain datasets, and unseen tasks.
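The abstract's two contributions can be illustrated with a minimal sketch: (a) prepending task-specific prompt tokens so one model serves all tasks, and (b) constraining identifier generation so that every emitted n-gram actually occurs in the corpus. This is not the paper's code; the prompt tokens, function names, and toy corpus are invented for illustration, and a real UGR would use a pretrained seq2seq model with constrained beam search rather than a plain trie.

```python
# Illustrative sketch (not the authors' implementation).
# TASK_PROMPTS, prompt_input, build_ngram_trie, and allowed_next_tokens
# are hypothetical names introduced for this example.

# (a) Hand-designed prompt tokens per retrieval task (illustrative values).
TASK_PROMPTS = {
    "document": ["[DOC]"],
    "passage": ["[PSG]"],
    "sentence": ["[SENT]"],
    "entity": ["[ENT]"],
}

def prompt_input(task, query_tokens):
    """Prepend the task's prompt tokens so a single model can serve all tasks."""
    return TASK_PROMPTS[task] + query_tokens

# (b) A trie over corpus n-grams: decoding may only follow existing paths,
# so every generated identifier is a valid n-gram of some context.
def build_ngram_trie(corpus_tokens, n):
    trie = {}
    for i in range(len(corpus_tokens) - n + 1):
        node = trie
        for tok in corpus_tokens[i:i + n]:
            node = node.setdefault(tok, {})
    return trie

def allowed_next_tokens(trie, prefix):
    """Tokens the decoder is allowed to emit after `prefix` (empty set = dead end)."""
    node = trie
    for tok in prefix:
        if tok not in node:
            return set()
        node = node[tok]
    return set(node.keys())

if __name__ == "__main__":
    corpus = "the capital of france is paris".split()
    trie = build_ngram_trie(corpus, 3)
    print(prompt_input("passage", ["capital", "of", "france", "?"]))
    print(sorted(allowed_next_tokens(trie, ["capital", "of"])))  # ['france']
```

The constrained-decoding idea is the point: because the trie only contains n-grams observed in the corpus, the generator cannot hallucinate an identifier that maps to nothing, which is what makes the generated string usable as a retrieval key.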



    Published In

    SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2023
    3567 pages
    ISBN:9781450394086
    DOI:10.1145/3539618
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. generative retrieval
    2. knowledge-intensive language tasks
    3. unified retriever


    Funding Sources

    • National Natural Science Foundation of China (NSFC)
    • Lenovo-CAS Joint Lab Youth Scientist Project
    • Youth Innovation Promotion Association CAS
    • Young Elite Scientist Sponsorship Program by CAST
    • Hybrid Intelligence Center
    • China Scholarship Council

    Conference

    SIGIR '23

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%


    Article Metrics

    • Downloads (last 12 months): 457
    • Downloads (last 6 weeks): 35
    Reflects downloads up to 11 Dec 2024


    Cited By

    • (2024) Report on the Search Futures Workshop at ECIR 2024. ACM SIGIR Forum 58(1), 1-41. DOI: 10.1145/3687273.3687288. Online publication date: 7-Aug-2024.
    • (2024) Listwise Generative Retrieval Models via a Sequential Learning Process. ACM Transactions on Information Systems 42(5), 1-31. DOI: 10.1145/3653712. Online publication date: 29-Apr-2024.
    • (2024) Recent Advances in Generative Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 3005-3008. DOI: 10.1145/3626772.3661379. Online publication date: 10-Jul-2024.
    • (2024) Unifying Graph Retrieval and Prompt Tuning for Graph-Grounded Text Classification. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2682-2686. DOI: 10.1145/3626772.3657934. Online publication date: 10-Jul-2024.
    • (2024) Generative Retrieval via Term Set Generation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 458-468. DOI: 10.1145/3626772.3657797. Online publication date: 10-Jul-2024.
    • (2024) CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 26-37. DOI: 10.1145/3626772.3657778. Online publication date: 10-Jul-2024.
    • (2024) Planning Ahead in Generative Retrieval: Guiding Autoregressive Generation through Simultaneous Decoding. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 469-480. DOI: 10.1145/3626772.3657746. Online publication date: 10-Jul-2024.
    • (2024) Event Grounded Criminal Court View Generation with Cooperative (Large) Language Models. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2221-2230. DOI: 10.1145/3626772.3657698. Online publication date: 10-Jul-2024.
    • (2024) Recent Advances in Generative Information Retrieval. Companion Proceedings of the ACM on Web Conference 2024, 1238-1241. DOI: 10.1145/3589335.3641239. Online publication date: 13-May-2024.
    • (2024) Scalable and Effective Generative Information Retrieval. Proceedings of the ACM Web Conference 2024, 1441-1452. DOI: 10.1145/3589334.3645477. Online publication date: 13-May-2024.
