DOI: 10.1145/1835449.1835461
research-article

Ranking using multiple document types in desktop search

Published: 19 July 2010

Abstract

A typical desktop environment contains many document types (email, presentations, web pages, PDFs, etc.), each with different metadata. Predicting which types of documents a user is looking for in the context of a given query is a crucial part of providing effective desktop search. The problem is similar to selecting resources in distributed information retrieval (IR), but there are some important differences.
In this paper, we quantify the impact of type prediction in producing a merged ranking for desktop search and introduce a new prediction method that exploits type-specific metadata. In addition, we show that type prediction performance and search effectiveness can be further enhanced by combining existing methods of type prediction using discriminative learning models. Our experiments employ pseudo-desktop collections and a human computation game for acquiring realistic and reusable queries.
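
To make the idea of a merged ranking concrete, here is a minimal sketch of how per-type result lists might be combined using type-prediction scores. It is an illustration under stated assumptions, not the method from the paper: the function name `merge_rankings`, the multiplicative weighting, and the sample documents and scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def merge_rankings(
    per_type_results: Dict[str, List[Tuple[str, float]]],
    type_scores: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Merge per-type result lists into one ranking.

    Assumption (not from the paper): each document score is multiplied
    by the normalized prediction score of its document type.
    """
    total = sum(type_scores.values())
    merged: List[Tuple[str, float]] = []
    for doc_type, results in per_type_results.items():
        # Normalize type scores so they sum to 1 across all types.
        weight = type_scores.get(doc_type, 0.0) / total if total > 0 else 0.0
        for doc_id, score in results:
            merged.append((doc_id, score * weight))
    # Highest combined score first; ties are broken by document id.
    merged.sort(key=lambda pair: (-pair[1], pair[0]))
    return merged


# Hypothetical retrieval scores per document type for one query.
per_type_results = {
    "email": [("email_17", 0.9), ("email_03", 0.4)],
    "pdf": [("pdf_08", 0.8)],
}
# Hypothetical type-prediction scores for the same query.
type_scores = {"email": 0.25, "pdf": 0.75}

ranking = merge_rankings(per_type_results, type_scores)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

In this toy setup the type prediction favors PDFs, so a PDF with a slightly lower retrieval score can still outrank the best email. That is the kind of effect the abstract refers to when it speaks of the impact of type prediction on the merged ranking.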


    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. desktop search
    2. human computation game
    3. information retrieval
    4. semi-structured document retrieval
    5. type prediction

    Qualifiers

    • Research-article

    Conference

    SIGIR '10

    Acceptance Rates

    SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Bibliometrics & Citations

    Article Metrics

    • Downloads (Last 12 months): 3
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 11 Dec 2024

    Cited By

    • (2022) 'It's on the tip of my tongue'. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, pages 48-56. DOI: 10.1145/3488560.3498421. Online publication date: 11-Feb-2022
    • (2021) Improving Cloud Storage Search with User Activity. Proceedings of the 14th ACM International Conference on Web Search and Data Mining, pages 508-516. DOI: 10.1145/3437963.3441780. Online publication date: 8-Mar-2021
    • (2019) The Effects of Working Memory, Perceptual Speed, and Inhibition in Aggregated Search. ACM Transactions on Information Systems, 37:3, pages 1-34. DOI: 10.1145/3322128. Online publication date: 16-May-2019
    • (2019) Clarifying False Memories in Voice-based Search. Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, pages 331-335. DOI: 10.1145/3295750.3298961. Online publication date: 8-Mar-2019
    • (2019) The ubiquitous digital file. Journal of the Association for Information Science and Technology, 71:1, pages E1-E32. DOI: 10.1002/asi.24222. Online publication date: 4-Dec-2019
    • (2017) Aggregated Search. Foundations and Trends in Information Retrieval, 10:5, pages 365-502. DOI: 10.1561/1500000052. Online publication date: 6-Mar-2017
    • (2016) HIA'16. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, page 1241. DOI: 10.1145/2911451.2917760. Online publication date: 7-Jul-2016
    • (2016) Efficient distributed selective search. Information Retrieval Journal, 20:3, pages 221-252. DOI: 10.1007/s10791-016-9290-6. Online publication date: 25-Nov-2016
    • (2015) Distributed Information Retrieval: Developments and Strategies. International Journal of Engineering Research in Africa, 16, pages 110-144. DOI: 10.4028/www.scientific.net/JERA.16.110. Online publication date: Jun-2015
    • (2015) HIA'15. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, pages 423-424. DOI: 10.1145/2684822.2697029. Online publication date: 2-Feb-2015