DOI: 10.1145/96749.98219

Using syntactic analysis in a document retrieval system that uses signature files

Published: 01 December 1989

Abstract

Our work studies the extent to which natural language processing techniques aid the automatic indexing and retrieval of documents. In this paper we describe the use of signature files in large text retrieval systems. We show that good performance can be obtained without the significant overheads of the inverted file technique. We examine the use of syntactic analysis of the text in all stages of retrieval and argue that an initial Boolean query should be performed to select a subset of documents, which are then ranked. We then give an algorithm for generating such queries that takes the syntactic structure of the queries into account.
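
The two-stage retrieval described in the abstract (a Boolean query over signatures selects a subset of documents, which are then ranked) can be illustrated with a minimal sketch. This is not the paper's algorithm: the signature width, the number of bits per term, the MD5-based hashing, the whitespace tokenizer, and the term-count ranking are all assumptions chosen for illustration, and the syntactic query generation the paper proposes is not modeled here.

```python
import hashlib
from collections import Counter

SIG_BITS = 64       # signature width in bits (assumed value)
BITS_PER_TERM = 3   # bits set per term (assumed value)


def term_mask(term):
    """Hash a term to a bit mask with up to BITS_PER_TERM bits set."""
    mask = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.md5(f"{i}:{term}".encode("utf-8")).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def signature(terms):
    """Superimpose (bitwise OR) the masks of all terms into one signature."""
    sig = 0
    for term in terms:
        sig |= term_mask(term)
    return sig


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def search(docs, required, ranking_terms):
    """Boolean AND filter on signatures, then rank the surviving documents."""
    # A real system would build and store the signatures at indexing time.
    sigs = [signature(tokenize(doc)) for doc in docs]
    query_sig = signature(required)
    # A document can match only if every query bit is set in its signature.
    candidates = [i for i, s in enumerate(sigs) if s & query_sig == query_sig]
    hits = []
    for i in candidates:
        counts = Counter(tokenize(docs[i]))
        # Signatures can produce false drops, so verify against the text.
        if all(counts[t] > 0 for t in required):
            score = sum(counts[t] for t in ranking_terms)
            hits.append((score, i))
    # Highest score first; ties broken by document position.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in hits]


docs = [
    "inverted files for text retrieval",
    "signature files for text retrieval",
    "syntactic analysis of queries",
]
print(search(docs, ["text", "retrieval"], ["signature"]))
```

The signature test never rejects a document that contains all required terms, but it may admit documents that do not, which is why the sketch re-checks each candidate against its text before ranking.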

Published In

SIGIR '90: Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval
December 1989
509 pages
ISBN: 0897914082
DOI: 10.1145/96749

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

SIGIR'90

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (1994) A methodology for exploiting sophisticated representations for classification. Intelligent Multimedia Information Retrieval Systems and Management - Volume 1, pp. 21-33. DOI: 10.5555/2856823.2856827. Online publication date: 11-Oct-1994
  • (1994) Exploiting sophisticated representations for document retrieval. Proceedings of the fourth conference on Applied natural language processing, pp. 65-71. DOI: 10.3115/974358.974374. Online publication date: 13-Oct-1994
  • (1993) SIM: a Korean signature extraction method. Proceedings of TENCON '93. IEEE Region 10 International Conference on Computers, Communications and Automation, pp. 1110-1113. DOI: 10.1109/TENCON.1993.320067. Online publication date: 1993
  • (1993) A signature file method for Korean text retrieval. [1993] Proceedings IEEE International Conference on Developing and Managing Intelligent System Projects, pp. 174-181. DOI: 10.1109/DMISP.1993.248620. Online publication date: 1993
  • (1992) Retrieval activities in a database consisting of heterogeneous collections of structured text. Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 112-125. DOI: 10.1145/133160.133185. Online publication date: 1-Jun-1992
