Article

K-tree/forest: efficient indexes for boolean queries

Published: 11 August 2002

Abstract

In Information Retrieval it is well known that the complexity of processing boolean queries depends on the size of the intermediate results, which can be huge (and typically reside on disk) even when the final result is quite small. With inverted files, the most time-consuming operation is merging or intersecting the lists of occurrences [1]. We propose the Keyword tree (K-tree) and forest, efficient structures for handling boolean queries in keyword-based information retrieval. Extensive simulations show that the K-tree is orders of magnitude faster (i.e., requires far fewer I/Os) for boolean queries than the usual approach of merging the lists of occurrences, and that it incurs only a small overhead for single-keyword queries. The K-tree can also be efficiently parallelized. The construction cost of the K-tree is comparable to the cost of building inverted files.
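To make the baseline concrete, the following is a minimal sketch of the "usual approach" the abstract compares against: answering a boolean AND query by intersecting sorted lists of occurrences from an inverted file. It is not the K-tree, whose structure the abstract does not describe. The toy index, the keyword names, and the function names `intersect_postings` and `boolean_and` are assumptions made for illustration only.

```python
from typing import Dict, List


def intersect_postings(a: List[int], b: List[int]) -> List[int]:
    """Merge-intersect two sorted lists of document IDs."""
    i, j = 0, 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def boolean_and(index: Dict[str, List[int]], keywords: List[str]) -> List[int]:
    """Answer an AND query by intersecting the keywords' posting lists.

    Lists are processed shortest first so intermediate results stay small,
    but every list is still read in full, which is the cost the abstract
    attributes to inverted files.
    """
    lists = sorted((index.get(k, []) for k in keywords), key=len)
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        result = intersect_postings(result, postings)
        if not result:
            break  # no document can match once the intersection is empty
    return result


# Hypothetical toy inverted file: keyword -> sorted document IDs.
toy_index: Dict[str, List[int]] = {
    "tree": [1, 3, 5, 8, 13],
    "index": [2, 3, 8, 9, 13, 21],
    "boolean": [3, 4, 13],
}

print(boolean_and(toy_index, ["tree", "index", "boolean"]))
```

In this toy case the final answer holds two documents, while the three lists read to produce it hold fourteen entries in total. That gap between the work done and the size of the result is the cost the abstract describes.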

References

[1] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, 1999.
[2] Bruce Croft. The future of web search. In Invited Keynote Address at the 25th Australasian Computer Science Conference, Melbourne, Australia, 2002.

Published In

SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
August 2002
478 pages
ISBN:1581135610
DOI:10.1145/564376
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. indexing
  2. keyword tree
  3. keyword-based retrieval
  4. parallelization

Qualifiers

  • Article

Conference

SIGIR02
Acceptance Rates

SIGIR '02 Paper Acceptance Rate 44 of 219 submissions, 20%;
Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 454
  • Downloads (Last 12 months): 0
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 19 Dec 2024
