CORAL: Benchmarking Conversational Retrieval-Augmentation Generation

We present a large-scale conversational RAG benchmark named CORAL and propose a unified framework for standardizing and evaluating various conversational RAG baselines.

CORAL: CORAL has five critical features: open-domain coverage, knowledge-intensiveness, freeform response generation, handling of topic shifts, and citation labeling. In CORAL, we evaluate conversational RAG systems across three essential tasks:
(1) Conversational Passage Retrieval: assessing the system’s ability to retrieve relevant information from a large document set based on multi-turn context;
(2) Response Generation: evaluating the system’s capacity to generate accurate, contextually rich answers;
(3) Citation Labeling: ensuring that the generated responses are transparent and grounded by requiring correct attribution of sources.

Conversational RAG Framework: We develop a unified framework for standardizing and evaluating various conversational RAG baselines, facilitating systematic comparison and advancement in this rapidly evolving field.

🪸 CORAL

🌠 Overview of the Dataset Construction Process

🌈 Four Different Conversation Flow Sampling Strategies

🎯 Data statistics

| Statistic | LDS Train | LDS Test | SIDS Train | SIDS Test | STRW Train | STRW Test | DTRW Train | DTRW Test |
|---|---|---|---|---|---|---|---|---|
| # Conversation | 1800 | 200 | 1800 | 200 | 1800 | 200 | 1800 | 200 |
| # Turns | 5934 | 651 | 16082 | 1727 | 18165 | 1949 | 19411 | 2153 |
| # Turns / Conversation | 3.30 | 3.26 | 8.93 | 8.64 | 10.09 | 9.75 | 10.78 | 10.77 |
| # Tokens / Question | 13.70 | 13.89 | 12.62 | 12.64 | 12.72 | 12.88 | 14.15 | 14.75 |
| # Tokens / Response | 233.81 | 147.16 | 242.54 | 155.54 | 243.34 | 191.60 | 300.47 | 259.72 |
| # Positive passages / Turn | 3.25 | 2.03 | 2.64 | 1.73 | 3.01 | 2.12 | 3.98 | 3.50 |

Dataset Format

CORAL includes 8,000 conversations in JSON Lines format.
Each line in either the train_conversation.json or test_conversation.json file follows this structure:

    {
      "conv_id": "Train_type_convid",
      "turns": [
        {
          "turn_id": 1,
          "question": "",
          "response": "",
          "golden_rewrite": "",
          "golden_docs_pids": [],
          "golden_docs_text": []
        },
        {
          "turn_id": 2,
          "question": "",
          "response": "",
          "golden_rewrite": "",
          "golden_docs_pids": [],
          "golden_docs_text": []
        },
        ...
      ]
    }

🔥 Conversational RAG Framework

🚀 QuickStart

    git lfs clone https://huggingface.co/datasets/ariya2357/CORAL

🔖 License

Our code is licensed under the MIT License. Our dataset is distributed under the CC BY-SA 4.0 license.

🌟 Citation

Please cite our paper if it helps your research:

    @article{coral,
      author     = {Yiruo Cheng and Kelong Mao and Ziliang Zhao and Guanting Dong and Hongjin Qian and Yongkang Wu and Tetsuya Sakai and Ji{-}Rong Wen and Zhicheng Dou},
      title      = {{CORAL:} Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation},
      journal    = {CoRR},
      volume     = {abs/2410.23090},
      year       = {2024},
      url        = {https://doi.org/10.48550/arXiv.2410.23090},
      doi        = {10.48550/ARXIV.2410.23090},
      eprinttype = {arXiv},
      eprint     = {2410.23090},
      timestamp  = {Fri, 29 Nov 2024 21:16:27 +0100},
      biburl     = {https://dblp.org/rec/journals/corr/abs-2410-23090.bib},
      bibsource  = {dblp computer science bibliography, https://dblp.org}
    }
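
Reading the data: as a companion to the record structure described under Dataset Format, the following is a minimal Python sketch of how the conversation files might be read and turned into per-turn retrieval examples. It is not part of the official CORAL framework. The helper names `iter_conversations` and `build_retrieval_examples` are hypothetical, the sample record is made up for illustration, and the use of integer passage IDs is an assumption (the actual ID type in the released files may differ).

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_conversations(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one conversation dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def build_retrieval_examples(conversation: Dict) -> List[Dict]:
    """Pair each turn's question with all earlier turns as history."""
    examples = []
    history = []  # (question, response) pairs from earlier turns
    for turn in conversation["turns"]:
        examples.append({
            "conv_id": conversation["conv_id"],
            "turn_id": turn["turn_id"],
            "history": list(history),  # copy, so later appends do not leak in
            "question": turn["question"],
            "golden_docs_pids": turn["golden_docs_pids"],
        })
        history.append((turn["question"], turn["response"]))
    return examples


# Made-up record for illustration only; it is NOT taken from CORAL.
# Integer passage IDs are an assumption.
sample_line = json.dumps({
    "conv_id": "Train_type_convid",
    "turns": [
        {"turn_id": 1, "question": "q1", "response": "r1",
         "golden_rewrite": "q1", "golden_docs_pids": [3],
         "golden_docs_text": ["passage text"]},
        {"turn_id": 2, "question": "q2", "response": "r2",
         "golden_rewrite": "q2 rewritten", "golden_docs_pids": [7],
         "golden_docs_text": ["another passage"]},
    ],
})

examples = [
    example
    for conversation in iter_conversations([sample_line])
    for example in build_retrieval_examples(conversation)
]

for example in examples:
    print(example["turn_id"], len(example["history"]), example["golden_docs_pids"])
```

To run the same helpers on a downloaded file, an open file handle can be passed in place of the list, for example `with open("train_conversation.json", encoding="utf-8") as f: conversations = list(iter_conversations(f))`. Keeping the history as a separate list per example makes it straightforward to feed either the raw multi-turn context or the `golden_rewrite` field to a retriever.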