
A document-document similarity measure based on cited titles and probability theory, and its application to relevance feedback retrieval

Published: 02 July 1984

Abstract

The use of cited title terms of a scientific document for automatic indexing is explored. It offers a means of index term selection as well as term relevance weighting, based on author-provided relevance information and Bayes' theorem as in probabilistic retrieval. The latter quantitative consideration leads to a new measure of document-document similarity, which is shown to have importance both for initial search and in relevance feedback retrieval, by offering a choice of iterative strategies. Extension of the concept of cited title terms to citing title terms shows that these two approaches are compatible with the current two competing models of probability of relevance for document retrieval (Robertson et al. 1982), if a document can also be regarded as a query. Their term usage may therefore provide the necessary statistics for parameter estimation to test both theories.
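
The abstract does not give the paper's formulas, so the following is only a rough sketch of the kind of probabilistic term weighting it refers to, not the paper's own measure. It uses the standard relevance weight of Robertson & Sparck Jones (1976) [12] and assumes, purely for illustration, that the documents a paper cites play the role of its "relevant" set. The function names, the shared-term similarity score, and all counts are hypothetical.

```python
import math


def rsj_weight(r, R, n, N):
    """Robertson/Sparck Jones relevance weight with the usual 0.5 smoothing.

    r: relevant documents whose titles contain the term
    R: relevant documents in total (assumed here: the cited documents)
    n: documents in the collection whose titles contain the term
    N: documents in the collection
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((R - r + 0.5) * (n - r + 0.5)))


def similarity(doc_terms, other_terms, weights):
    """Illustrative score: sum the weights of the terms two documents share.

    Terms without a known weight contribute nothing.
    """
    shared = set(doc_terms) & set(other_terms)
    return sum(weights.get(term, 0.0) for term in shared)


# Hypothetical counts: a term found in 4 of 5 cited titles but in only
# 20 of 100 titles overall receives a positive weight.
w_common_in_cited = rsj_weight(r=4, R=5, n=20, N=100)

# A term absent from the cited titles but frequent overall is penalised.
w_absent_in_cited = rsj_weight(r=0, R=5, n=50, N=100)
```

Under these assumptions, terms concentrated in the cited titles dominate the score, which is the intuition behind weighting index terms with author-provided relevance information.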

References

[1]
Bookstein, A. & Swanson, D.R. (1975). A decision theoretic foundation for indexing. J. of ASIS, 26, 45--50.
[2]
Cooper, W.S. & Huizinga, P. (1982). The maximum entropy principle and its application to the design of probabilistic retrieval systems. Info. Tech.: R. & D., 1, 99--112.
[3]
Croft, W.B. (1980). A model of cluster searching based on classification. Info. Sys., 5, 189--195.
[4]
Harper, D.J. & van Rijsbergen, C.J. (1978). An evaluation of feedback in document retrieval using co-occurrence data. J. of Doc., 34, 189--216.
[5]
Kwok, K.L. (1974). Cited titles - a new source of keyword extraction for automatic document classification and retrieval. Proc. ASIS Mtg., 11, 56--57.
[6]
Kwok, K.L. (1975). The use of title and cited titles as document representation for automatic classification. Info. Proc. Mgmt., 11, 201--206.
[7]
Luhn, H.P. (1958). The automatic creation of literature abstracts. IBM J. of R. & D., 2, 159--165.
[8]
Maron, M.E. & Kuhns, J.L. (1960). On relevance, probabilistic indexing and information retrieval. J. ACM, 7, 216--244.
[9]
Oddy, R.N., Robertson, S.E., van Rijsbergen, C.J. & Williams, P.W. (1981). (editors) Information Retrieval Research, London: Butterworths.
[10]
Robertson, S.E. (1977). Theories and models in information retrieval. J. of Doc., 33, 126--148.
[11]
Robertson, S.E., Maron, M.E. & Cooper, W.S. (1982). Probability of relevance: a unification of two competing models for document retrieval. Info. Tech.: R. & D., 1, 1--21.
[12]
Robertson, S.E. & Sparck Jones, K. (1976). Relevance weighting of search terms. J. of ASIS, 27, 129--146.
[13]
Rocchio, J.J. Jr. (1971). Relevance feedback in information retrieval. In The SMART retrieval system - experiments in automatic document processing, ed. G. Salton, pp. 313--323. Englewood Cliffs, NJ: Prentice-Hall.
[14]
Salton, G. (1979). Mathematics and information retrieval. J. of Doc., 35, 1--29.
[15]
Sparck Jones, K. (1979). Experiments in relevance weighting of search terms. Info. Proc. Mgmt., 15, 133--144.
[16]
van Rijsbergen, C.J. (1977). A theoretical basis for the use of co-occurrence data in information retrieval. J. of Doc., 33, 106--119.
[17]
Yu, C.T. & Salton, G. (1976). Precision weighting - an effective automatic indexing method. J. ACM, 23, 76--88.


    Published In

    SIGIR '84: Proceedings of the 7th annual international ACM SIGIR conference on Research and development in information retrieval
    July 1984
    422 pages
    ISBN: 0521268656

    Publisher

    BCS Learning & Development Ltd.

    Swindon, United Kingdom

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Bibliometrics

    • Downloads (last 12 months): 21
    • Downloads (last 6 weeks): 4
    Reflects downloads up to 18 Jan 2025

    Cited By

    • (2010) Global ranking via data fusion. Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pp. 223-231. doi:10.5555/1944566.1944592. Online publication date: 23-Aug-2010
    • (2008) Global ranking using Continuous Conditional Random Fields. Proceedings of the 22nd International Conference on Neural Information Processing Systems, pp. 1281-1288. doi:10.5555/2981780.2981940. Online publication date: 8-Dec-2008
    • (1998) Enhanced hypertext categorization using hyperlinks. ACM SIGMOD Record, 27:2, pp. 307-318. doi:10.1145/276305.276332. Online publication date: 1-Jun-1998
    • (1998) Enhanced hypertext categorization using hyperlinks. Proceedings of the 1998 ACM SIGMOD international conference on Management of data, pp. 307-318. doi:10.1145/276304.276332. Online publication date: 1-Jun-1998
