research-article

A Probabilistic Fusion Framework

Authors:

Ella Rabinovich

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

Pages 1463 - 1472

https://doi.org/10.1145/2983323.2983739

Published: 24 October 2016

Abstract

There are numerous methods for fusing document lists retrieved from the same corpus in response to a query. Many of these methods are based on seemingly unrelated techniques and heuristics. Herein we present a probabilistic framework for the fusion task. The framework provides a formal basis for deriving and explaining many fusion approaches and the connections between them. Instantiating the framework using various estimates yields novel fusion methods, some of which significantly outperform state-of-the-art approaches.

Index Terms

A Probabilistic Fusion Framework
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Combination, fusion and federated search

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

October 2016

2566 pages

ISBN: 9781450340731

DOI: 10.1145/2983323

General Chairs:
Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA)
ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)

Program Chairs:
Elisa Bertino (Purdue University)
Fabio Crestani (University of Lugano)
Javed Mostafa (University of North Carolina)
Jie Tang (Tsinghua University)
Luo Si (Alibaba Group Inc & Purdue University)
Xiaofang Zhou (University of Queensland)
Yi Chang (Yahoo Research)
Yunyao Li (IBM Research - Almaden)
Parikshit Sondhi (WalmartLabs)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

fusion

Qualifiers

Research-article

Funding Sources

Technion-Microsoft Electronic Commerce Research Center
Technion-Israel Institute of Technology

Conference

CIKM'16: ACM Conference on Information and Knowledge Management

October 24 - 28, 2016

Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
