research-article

Information Needs, Queries, and Query Performance Prediction

Authors:

J. Shane Culpepper

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 395 - 404

https://doi.org/10.1145/3331184.3331253

Published: 18 July 2019

Abstract

The query performance prediction (QPP) task is to estimate the effectiveness of a search performed in response to a query with no relevance judgments. Existing QPP methods do not account for the effectiveness of a query in representing the underlying information need. We demonstrate the far-reaching implications of this reality using standard TREC-based evaluation of QPP methods: their relative prediction quality patterns vary with respect to the effectiveness of queries used to represent the information needs. Motivated by our findings, we revise the basic probabilistic formulation of the QPP task by accounting for the information need and its connection to the query. We further explore this connection by proposing a novel QPP approach that utilizes information about a set of queries representing the same information need. Predictors instantiated from our approach using a wide variety of existing QPP methods post prediction quality that substantially transcends that of applying these methods, as is standard, using a single query representing the information need. Additional in-depth empirical analysis of different aspects of our approach further attests to the crucial role of query effectiveness in QPP.
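As a rough illustration of the evaluation setting the abstract refers to, the sketch below measures prediction quality as the Pearson correlation between predicted values and actual effectiveness across information needs, first for a single query per need and then for a value aggregated over several queries representing the same need. The aggregation by mean, the function names, and all numbers are assumptions made up for this example; they are not the authors' approach or results.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def prediction_quality(predicted, actual):
    """Correlate per-topic predicted values with per-topic actual effectiveness."""
    topics = sorted(actual)
    return pearson([predicted[t] for t in topics], [actual[t] for t in topics])


def aggregate_over_variants(variant_scores, aggregate=mean):
    """Collapse the predictor values of several query variants into one per topic.

    The choice of `aggregate` (mean here) is a hypothetical stand-in, not the
    method proposed in the paper.
    """
    return {topic: aggregate(scores) for topic, scores in variant_scores.items()}


# Toy numbers, invented purely for illustration (not results from the paper).
actual_ap = {"t1": 0.10, "t2": 0.40, "t3": 0.70}
single_query_pred = {"t1": 0.5, "t2": 0.2, "t3": 0.6}
variant_preds = {
    "t1": [0.5, 0.1, 0.2],
    "t2": [0.2, 0.5, 0.4],
    "t3": [0.6, 0.8, 0.7],
}

if __name__ == "__main__":
    print("single query:", prediction_quality(single_query_pred, actual_ap))
    print("variant mean:",
          prediction_quality(aggregate_over_variants(variant_preds), actual_ap))
```

Swapping `aggregate` for `max` or `min` shows how sensitive the correlation is to which query is taken to represent the information need, which is the dependence the abstract highlights.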

Supplementary Material

MP4 File (cite1-11h40-d2.mp4), 404.90 MB

Index Terms

Information Needs, Queries, and Query Performance Prediction
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
      1. Query reformulation
      2. Query representation


Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2019

1512 pages

ISBN:9781450361729

DOI:10.1145/3331184

General Chairs:
Benjamin Piwowarski (CNRS - Sorbonne Universite, France)
Max Chevalier (Universite de Toulouse, CNRS, France)
Eric Gaussier (Universite Grenoble Alpes, CNRS, France)

Program Chairs:
Yoelle Maarek (Amazon Research, Israel)
Jian-Yun Nie (University of Montreal, Canada)
Falk Scholer (RMIT University, Australia)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '19

Sponsor:

SIGIR

SIGIR '19: The 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 21 - 25, 2019

Paris, France

Acceptance Rates

SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total Citations: 19
Total Downloads: 842
Downloads (Last 12 months): 77
Downloads (Last 6 weeks): 5

Reflects downloads up to 18 Dec 2024

Cited By

Rashidi L, Zobel J, Moffat A, Oosterhuis H, Bast H, Xiong C. (2024). Query Variability and Experimental Consistency: A Concerning Case Study. Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 35-41. Online publication date: 2-Aug-2024.
https://dl.acm.org/doi/10.1145/3664190.3672519

Anand A, V V, Setty V, Anand A, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y. (2024). The Surprising Effectiveness of Rankers trained on Expanded Queries. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2652-2656. Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3657938

Faggioli G, Formal T, Lupart S, Marchesin S, Clinchant S, Ferro N, Piwowarski B, Yoshioka M, Kiseleva J, Aliannejadi M. (2023). Towards Query Performance Prediction for Neural Information Retrieval: Challenges and Opportunities. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 51-63. Online publication date: 9-Aug-2023.
https://dl.acm.org/doi/10.1145/3578337.3605142

Faggioli G, Ferro N, Muntean C, Perego R, Tonellotto N, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B. (2023). A Geometric Framework for Query Performance Prediction in Conversational Search. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1355-1365. Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3591625

Wang B, Liu J. (2023). Characterizing and Early Predicting User Performance for Adaptive Search Path Recommendation. Proceedings of the Association for Information Science and Technology, 60:1, 408-420. Online publication date: 22-Oct-2023.
https://doi.org/10.1002/pra2.799

He Z, Yu J, Guo B. (2022). Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach. Symmetry, 14:1, 55. Online publication date: 1-Jan-2022.
https://doi.org/10.3390/sym14010055

Datta S, Ganguly D, Mitra M, Greene D. (2022). A Relative Information Gain-based Query Performance Prediction Framework with Generated Query Variants. ACM Transactions on Information Systems, 41:2, 1-31. Online publication date: 21-Dec-2022.
https://dl.acm.org/doi/10.1145/3545112

Zendel O, Ebrahim M, Culpepper J, Moffat A, Scholer F, Amigo E, Castells P, Gonzalo J, Carterette B, Culpepper J, Kazai G. (2022). Can Users Predict Relative Query Effectiveness? Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2545-2549. Online publication date: 6-Jul-2022.
https://dl.acm.org/doi/10.1145/3477495.3531893

Faggioli G, Zendel O, Culpepper J, Ferro N, Scholer F. (2022). sMARE: a new paradigm to evaluate and understand query performance prediction methods. Information Retrieval, 25:2, 94-122. Online publication date: 1-Jun-2022.
https://dl.acm.org/doi/10.1007/s10791-022-09407-w

Chen X, He B, Sun L. (2022). Groupwise Query Performance Prediction with BERT. Advances in Information Retrieval, 64-74. Online publication date: 10-Apr-2022.
https://dl.acm.org/doi/10.1007/978-3-030-99739-7_8
