DOI: 10.1145/2983323.2983739
research-article

A Probabilistic Fusion Framework

Published: 24 October 2016

Abstract

There are numerous methods for fusing document lists retrieved from the same corpus in response to a query. Many of these methods are based on seemingly unrelated techniques and heuristics. Herein we present a probabilistic framework for the fusion task. The framework provides a formal basis for deriving and explaining many fusion approaches and the connections between them. Instantiating the framework using various estimates yields novel fusion methods, some of which significantly outperform state-of-the-art approaches.
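To make the fusion task concrete, the sketch below fuses two document lists with two well-known methods from the literature, CombSUM and reciprocal rank fusion. It is not the probabilistic framework described in the abstract, only an illustration of the kind of score-based and rank-based heuristics such a framework sets out to explain. The function names, the min-max normalization step, the toy scores, and the constant `k=60` (a commonly used default) are assumptions made for this example.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one list's retrieval scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document the same.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(lists):
    """CombSUM: sum each document's normalized scores across lists."""
    fused = defaultdict(float)
    for scores in lists:
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += s
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


def reciprocal_rank_fusion(lists, k=60):
    """RRF: sum 1 / (k + rank) over the lists that retrieved the document."""
    fused = defaultdict(float)
    for scores in lists:
        ranking = sorted(scores, key=lambda d: (-scores[d], d))
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + rank)
    return sorted(fused, key=lambda d: (-fused[d], d))


# Toy, made-up scores from two hypothetical retrieval systems.
list_a = {"d1": 12.0, "d2": 9.0, "d3": 3.0}
list_b = {"d2": 0.9, "d4": 0.8, "d1": 0.1}

print(comb_sum([list_a, list_b]))
print(reciprocal_rank_fusion([list_a, list_b]))
```

Both methods reward documents that appear high in several lists; CombSUM relies on the retrieval scores, whereas reciprocal rank fusion uses only rank positions.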



Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN:9781450340731
DOI:10.1145/2983323
Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

  1. fusion

Qualifiers

  • Research-article

Funding Sources

  • Technion-Microsoft Electronic Commerce Research Center
  • Technion-Israel Institute of Technology

Conference

CIKM'16: ACM Conference on Information and Knowledge Management
October 24 - 28, 2016
Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Cited By

  • (2024) Analyzing Fusion Methods Using the Condorcet Rule. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2281-2285. DOI: 10.1145/3626772.3657912. Online publication date: 10-Jul-2024.
  • (2024) Weighted AUReC: Handling Skew in Shard Map Quality Estimation for Selective Search. Advances in Information Retrieval, pages 87-96. DOI: 10.1007/978-3-031-56066-8_10. Online publication date: 24-Mar-2024.
  • (2023) Revisiting Condorcet Fusion. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, pages 199-204. DOI: 10.1145/3578337.3605140. Online publication date: 9-Aug-2023.
  • (2023) Performance prediction of multivariable linear regression based on the optimal influencing factors for ranking aggregation in crowdsourcing task. Data Technologies and Applications, 58(2):176-200. DOI: 10.1108/DTA-09-2022-0346. Online publication date: 4-Jul-2023.
  • (2021) Directing and Combining Multiple Queries for Exploratory Search by Visual Interactive Intent Modeling. Human-Computer Interaction – INTERACT 2021, pages 514-535. DOI: 10.1007/978-3-030-85613-7_34. Online publication date: 26-Aug-2021.
  • (2021) Assessing the Benefits of Model Ensembles in Neural Re-ranking for Passage Retrieval. Advances in Information Retrieval, pages 225-232. DOI: 10.1007/978-3-030-72240-1_19. Online publication date: 30-Mar-2021.
  • (2020) On the Evaluation of Data Fusion for Information Retrieval. Proceedings of the 12th Annual Meeting of the Forum for Information Retrieval Evaluation, pages 54-57. DOI: 10.1145/3441501.3441506. Online publication date: 16-Dec-2020.
  • (2019) Utilizing Passages in Fusion-based Document Retrieval. Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, pages 59-66. DOI: 10.1145/3341981.3344212. Online publication date: 26-Sep-2019.
  • (2018) To Clean or Not to Clean. Journal of Data and Information Quality, 10(4):1-25. DOI: 10.1145/3242180. Online publication date: 29-Oct-2018.
  • (2018) Utilizing Pseudo-Relevance Feedback in Fusion-based Retrieval. Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, pages 203-206. DOI: 10.1145/3234944.3234969. Online publication date: 10-Sep-2018.
