Results and Lessons of the Question Answering Track at CLEF

Anselmo Peñas,
Álvaro Rodrigo,
Bernardo Magnini,
Pamela Forner,
Eduard Hovy,
Richard Sutcliffe &
Danilo Giampiccolo

Part of the book series: The Information Retrieval Series (INRE, volume 41)


Abstract

The Question Answering track at CLEF ran for 13 years, from 2003 to 2015. Over these years, many different tasks, resources and evaluation methodologies were developed. We divide the CLEF Question Answering campaigns into four eras: (1) ungrouped, mainly factoid questions asked against monolingual newspapers (2003–2006); (2) grouped questions asked against newspapers and Wikipedias (2007–2008); (3) ungrouped questions against multilingual parallel-aligned EU legislative documents (2009–2010); and (4) questions about a single document using a related document collection as background information (2011–2015). We describe each of these eras and give its main results, together with the pilot exercises and other Question Answering tasks that ran at CLEF. Finally, we conclude with some of the lessons learnt over these years.



Acknowledgements

This work has been partially funded by the Spanish Research Agency (Agencia Estatal de Investigación) LIHLITH project (PCIN-2017-085/AEI).

Author information

Authors and Affiliations

NLP&IR Group at UNED, Madrid, Spain
Anselmo Peñas & Álvaro Rodrigo
Natural Language Processing Research Unit, FBK, Trento, Italy
Bernardo Magnini
FBK - PMG, Trento, Italy
Pamela Forner & Danilo Giampiccolo
Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA, USA
Eduard Hovy
CSIS Department, University of Limerick, Limerick, Ireland
Richard Sutcliffe

Corresponding author

Correspondence to Álvaro Rodrigo.

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Padova, Padova, Italy
Nicola Ferro
Consiglio Nazionale delle Ricerche, Istituto di Scienza e Tecnologie dell’Informazione, Pisa, Italy
Carol Peters

About this chapter

Cite this chapter

Peñas, A. et al. (2019). Results and Lessons of the Question Answering Track at CLEF. In: Ferro, N., Peters, C. (eds) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-030-22948-1_18

DOI: https://doi.org/10.1007/978-3-030-22948-1_18
Published: 14 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22947-4
Online ISBN: 978-3-030-22948-1
eBook Packages: Computer Science (R0)
