[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/215206.215329acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
Article
Free access

Fast evaluation of structured queries for information retrieval

Published: 01 July 1995 Publication History

Abstract

Information retrieval systems are being challenged to manage larger and larger document collections. In an effort to provide better retrieval performance on large collections, more sophisticated retrieval techniques have been developed that support rich, structured queries. Structured queries are not amenable to previously proposed optimization techniques. Optimizing execution, however, is even more important in the context of large document collections. We present a new structured query optimization technique which we have implemented in an inference network-based information retrieval system. Experimental results show that query evaluation time can be reduced by more than half with little impact on retrieval effectiveness.

References

[1]
E. W. Brown, J. P. Callan, and W. B. Croft. Fast incremental indexing for full-text information retrieval, in Proc. of the 20th Inter. Conf. on Very Large Databases (VLDB), pages 192-202, Santiago, Sept. 1994.
[2]
E. W. Brown, J. E Callan, W. B. Croft, and J. E. B. Moss. Supporting full-text information retrieval with a persistent object store. In Proc. of the 4th Inter. Conf. on Extending Database Technology (EDBT), pages 365-378, Cambridge, UK, Mar. 1994
[3]
C. Buckley and A. E Lewit. Optimization of inverted vector searches. In Proc. of the 8th Inter. A CM SIGIR Conf. on Research and Development in Information Retrieval, pages 97-110, June 1985.
[4]
J. E Callan, W. B. Croft, and S. M. Harding. The INQUERY retrieval system. In Proc. of the 3rd Inter. Conf. on Database and Expert Systems Applications, Sept. 1992.
[5]
C. Faloutsos. Access methods for text. A CM Comput. Surv., 17:50-74, 1985.
[6]
D. Harman, editor. The Second Text REtrieval Conference (TREC2), Gaithersburg, MD, 1994. National Institute of Standards and Technology Special Publication 500-215.
[7]
D. Harman and G. Candela. Retrieving records from a gigabyte of text on a minicomputer using statistical ranking. J. Amer. Soc. Inf. Sci., 41 (8):581-589, Dec. 1990.
[8]
D. Harman, E. Fox, R. Baeza-Yates, and W. Lee. Inverted files. In W. B. Frakes and R. Baeza-Yates, editors, Information Retrieval: Data Structures & Algorithms, chapter 3, pages 28--43. Prentice Hall, Englewood Cliffs, NJ, 1992.
[9]
Y. Jing and W. B. Croft. An association thesaurus for information retrieval. In Proc. of RIAO 94 Conf., pages 146-I 60, New York, Oct. 1994.
[10]
D. Knaus and E Sch~iuble. Effective and efficient retrieval from large and dynamic document collections. In Harman {6}, pages 163-170.
[11]
D. Lucarella. A document retrieval system based on nearest neighbour searching. J. Inf. Sci., 14(t):25-33, 1988.
[12]
A. Moffat and J. Zobel. Fast ranking in limited space. In Proc. l Oth IEEE Inter. Conf. on Data Engineering, pages 428-437, Feb. 1994.
[13]
A. Moffat and J. Zobel. Self-indexing inverted files for fast text retrieval. Technical Report 94/2, Collaborative Information Technology Research institute, Department of Computer Science, Royal Melbourne Institute of Technology, Australia, Feb. 1994.
[14]
J. E. B. Moss. Design of the Mneme persistent object store. ACM Trans. Inf Syst., 8(2):103-139, Apr. 1990.
[15]
S. A. Perry and P. Willett. A review of the use of inverted files for best match searching in information retrieval systems. J. Inf. Sci., 6(2-3):59--66, 1983.
[16]
M. Persin. Document filtering for fast ranking. In Proc. of the 17th Inter ACM SIG1R Conf. on Research and Development in Information Retrieval, pages 339-348, Dublin, July t994.
[17]
E Sch~iuble. SPIDER: A multiuser information retrieval system for semistructured and dynamic data. In Proc of the 16th Inter. A CM SIGIR Conf on Research and Development in hformation Retrteval, pages 318-327, Pittsburgh, June I993.
[18]
A. E Smeaton and C. J. van Rijsbergen. The nearest neighbour problem in information retrieval. An algorithm using upperbounds. In Proc. of the 4th hzter. A CM SIGIR Cot~ on Research and Development in h~formatton Retrieval, pages 83-87, Oakland, CA, 1981.
[19]
H. R. Turtle and W. B. Croft. Evaluation of an inference network-based retrieval model A CM Trans. lnf Syst. 9(3): 187-222, July 1991.
[20]
W. Y. E Wong and D. L. Lee. Implementations of partial document ranking using inverted files. Inf. Process. & Mgmnt., 29(5):647-669, 1993.
[21]
G. K. Zipf. Human Behavior and the Principle of Least Effort. Addison-Wesley Press, 1949.

Cited By

View all
  • (2020)Using an Inverted Index Synopsis for Query Latency and Performance PredictionACM Transactions on Information Systems10.1145/338979538:3(1-33)Online publication date: 18-May-2020
  • (2020)Examining the Additivity of Top-k Query Processing InnovationsProceedings of the 29th ACM International Conference on Information & Knowledge Management10.1145/3340531.3412000(1085-1094)Online publication date: 19-Oct-2020
  • (2015)Scalability Challenges in Web Search EnginesSynthesis Lectures on Information Concepts, Retrieval, and Services10.2200/S00662ED1V01Y201508ICR0457:6(1-138)Online publication date: 29-Dec-2015
  • Show More Cited By

Index Terms

  1. Fast evaluation of structured queries for information retrieval

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR '95: Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
    July 1995
    392 pages
    ISBN:0897917146
    DOI:10.1145/215206
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • German Comp Soc: GI - Gesellshaft for Informatik
    • CEPIS: Council of European Professional Informatics Societies
    • AICA: Assoc Italianai de Calcolo Automatico
    • IPSJ: Information Processing Society of Japan
    • DD
    • SIGIR: ACM Special Interest Group on Information Retrieval
    • BCS-ISRG: BCS-ISRG
    • BCS-IRSG: BCS/Information Retrieval Specialist Group

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 July 1995

    Permissions

    Request permissions for this article.

    Check for updates

    Qualifiers

    • Article

    Conference

    SIGIR95
    Sponsor:
    • German Comp Soc
    • CEPIS
    • AICA
    • IPSJ
    • DD
    • SIGIR
    • BCS-ISRG
    • BCS-IRSG

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)59
    • Downloads (Last 6 weeks)12
    Reflects downloads up to 12 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2020)Using an Inverted Index Synopsis for Query Latency and Performance PredictionACM Transactions on Information Systems10.1145/338979538:3(1-33)Online publication date: 18-May-2020
    • (2020)Examining the Additivity of Top-k Query Processing InnovationsProceedings of the 29th ACM International Conference on Information & Knowledge Management10.1145/3340531.3412000(1085-1094)Online publication date: 19-Oct-2020
    • (2015)Scalability Challenges in Web Search EnginesSynthesis Lectures on Information Concepts, Retrieval, and Services10.2200/S00662ED1V01Y201508ICR0457:6(1-138)Online publication date: 29-Dec-2015
    • (2015)Selective SearchACM Transactions on Information Systems10.1145/273803533:4(1-33)Online publication date: 23-Apr-2015
    • (2014)Faster MaxScore Document Retrieval with Aggressive ProcessingWeb-Age Information Management10.1007/978-3-319-08010-9_1(1-4)Online publication date: 2014
    • (2013)Exploring the magic of WANDProceedings of the 18th Australasian Document Computing Symposium10.1145/2537734.2537744(58-65)Online publication date: 5-Dec-2013
    • (2013)Fast candidate generation for real-time tweet search with bloom filter chainsACM Transactions on Information Systems10.1145/2493175.249317831:3(1-36)Online publication date: 5-Aug-2013
    • (2013)RASIMWorld Wide Web10.1007/s11280-012-0159-316:2(111-139)Online publication date: 1-Mar-2013
    • (2012)Efficient in-memory top-k document retrievalProceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval10.1145/2348283.2348317(225-234)Online publication date: 12-Aug-2012
    • (2011)Efficiency optimizations for interpolating subqueriesProceedings of the 20th ACM international conference on Information and knowledge management10.1145/2063576.2063625(297-306)Online publication date: 24-Oct-2011
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media