Article

Meta-scoring: automatically evaluating term weighting schemes in IR without precision-recall

Published: 01 September 2001

Abstract

In this paper, we present a method that automatically evaluates the performance of different term weighting schemes in information retrieval without resorting to precision-recall measurement based on human relevance judgments. Specifically, the problem is: given two document-term matrices generated by two different term weighting schemes, can we tell which scheme will perform better? We propose a meta-scoring function, which takes as input the document-term matrix generated by a term weighting scheme and computes a goodness score from that matrix. In our experiments, we found that this score is highly correlated with the precision-recall measurement for all the collections and term weighting schemes we tried. Thus, we conclude that our meta-scoring function can substitute for the precision-recall measurement, which requires relevance judgments from human subjects. Furthermore, this meta-scoring function is not limited to text information retrieval; it can also be applied to fields such as image and DNA retrieval.
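The abstract does not specify the meta-scoring function itself, so the following is only a minimal sketch of its inputs: two document-term matrices built from the same collection under two different term weighting schemes. The toy corpus, the choice of raw term frequency and one common tf-idf variant, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter

# Toy corpus, invented purely for illustration (not from the paper).
docs = [
    "term weighting in information retrieval",
    "image retrieval with term weighting",
    "dna sequence retrieval",
]


def build_vocab(tokenized):
    """Return the sorted list of distinct terms across all documents."""
    return sorted({term for doc in tokenized for term in doc})


def tf_matrix(tokenized, vocab):
    """Raw term frequency: entry [d][t] is the count of term t in document d."""
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        rows.append([float(counts[term]) for term in vocab])
    return rows


def tfidf_matrix(tokenized, vocab):
    """tf * log(N / df), one common tf-idf variant (an assumed scheme)."""
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = {term: sum(1 for doc in tokenized if term in doc) for term in vocab}
    tf = tf_matrix(tokenized, vocab)
    return [
        [row[j] * math.log(n_docs / df[term]) for j, term in enumerate(vocab)]
        for row in tf
    ]


tokenized = [d.split() for d in docs]
vocab = build_vocab(tokenized)
A = tf_matrix(tokenized, vocab)     # matrix from weighting scheme 1
B = tfidf_matrix(tokenized, vocab)  # matrix from weighting scheme 2
```

Both `A` and `B` have one row per document and one column per term. A meta-scoring function in the sense of the abstract would map each matrix to a single goodness score, so that the two schemes could be compared without relevance judgments.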


        Published In

        SIGIR '01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
        September 2001
        454 pages
        ISBN: 1581133316
        DOI: 10.1145/383952
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Conference

        SIGIR01
        Acceptance Rates

        SIGIR '01 Paper Acceptance Rate 47 of 201 submissions, 23%;
        Overall Acceptance Rate 792 of 3,983 submissions, 20%

        Cited By

        • (2020) Improving Online Clustering of Chinese Technology Web News With Bag-of-Near-Synonyms. IEEE Access 8, 94245-94257. DOI: 10.1109/ACCESS.2020.2995516. Online publication date: 2020
        • (2018) Experimental analysis of impact of term weighting schemes on cluster quality. International Journal of Advanced Intelligence Paradigms 10:1-2, 178-193. DOI: 10.5555/3192120.3192132. Online publication date: 1-Jan-2018
        • (2016) Intelligent tagging of online texts using fuzzy logic. 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 1-8. DOI: 10.1109/SSCI.2016.7849917. Online publication date: Dec-2016
        • (2016) Term frequency with average term occurrences for textual information retrieval. Soft Computing - A Fusion of Foundations, Methodologies and Applications 20:8, 3045-3061. DOI: 10.1007/s00500-015-1935-7. Online publication date: 1-Aug-2016
        • (2015) Non-Compositional Term Dependence for Information Retrieval. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 595-604. DOI: 10.1145/2766462.2767717. Online publication date: 9-Aug-2015
        • (2014) A new weighting scheme and discriminative approach for information retrieval in static and dynamic document collections. 2014 14th UK Workshop on Computational Intelligence (UKCI), 1-8. DOI: 10.1109/UKCI.2014.6930160. Online publication date: Sep-2014
        • (2014) Query-performance prediction for effective query routing in domain-specific repositories. Journal of the Association for Information Science and Technology 65:8, 1597-1614. DOI: 10.1002/asi.23072. Online publication date: 1-Aug-2014
        • (2013) Effective measures for inter-document similarity. Proceedings of the 22nd ACM international conference on Information & Knowledge Management, 1361-1370. DOI: 10.1145/2505515.2505526. Online publication date: 27-Oct-2013
        • (2013) From Ambiguous Words to Key-Concept Extraction. Proceedings of the 2013 24th International Workshop on Database and Expert Systems Applications, 63-67. DOI: 10.1109/DEXA.2013.16. Online publication date: 26-Aug-2013
        • (2013) Clustering Short-Text Using Non-negative Matrix Factorization of Hadamard Product of Similarities. Information Retrieval Technology, 145-155. DOI: 10.1007/978-3-642-45068-6_13. Online publication date: 2013
