Abstract
The Question Answering track at CLEF ran for 13 years, from 2003 to 2015. Over these years, many different tasks, resources and evaluation methodologies were developed. We divide the CLEF Question Answering campaigns into four eras: (1) ungrouped, mainly factoid questions asked against monolingual newspapers (2003–2006); (2) grouped questions asked against newspapers and Wikipedias (2007–2008); (3) ungrouped questions asked against multilingual parallel-aligned EU legislative documents (2009–2010); and (4) questions about a single document, using a related document collection as background information (2011–2015). We describe each of these eras and its main results, together with the pilot exercises and other Question Answering tasks that ran in CLEF. Finally, we conclude with some of the lessons learnt over these years.
Acknowledgements
This work has been partially funded by the Spanish Research Agency (Agencia Estatal de Investigación) LIHLITH project (PCIN-2017-085/AEI).
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Peñas, A. et al. (2019). Results and Lessons of the Question Answering Track at CLEF. In: Ferro, N., Peters, C. (eds) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-030-22948-1_18
DOI: https://doi.org/10.1007/978-3-030-22948-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22947-4
Online ISBN: 978-3-030-22948-1
eBook Packages: Computer Science, Computer Science (R0)