Abstract
Most inferential approaches to Information Retrieval (IR) have been investigated within the probabilistic framework. Although these approaches cope with the uncertainty inherent in IR inference, the strict formalism of probability theory often confines the usable knowledge to statistical knowledge alone (e.g. connections between terms based on their co-occurrences). Human-defined knowledge (e.g. manual thesauri) can be incorporated only with difficulty. In this paper, building on a general idea proposed by van Rijsbergen, we first develop an inferential approach within a fuzzy modal logic framework. Unlike previous approaches, ours treats the logical component as its central pillar. In addition, the flexibility of a fuzzy modal logic framework makes it possible to incorporate human-defined knowledge into the inference process. After defining the model, we describe a method for incorporating a human-defined thesaurus into inference that takes user relevance feedback into account. Experiments on the CACM corpus using WordNet, a general thesaurus of English, indicate a significant improvement in the system's performance.
References
Bookstein, A. (1983). Outline of a General Probabilistic Retrieval Model. Journal of Documentation 39(2): 63–72.
Buell, D. A. (1982). An Analysis of Some Fuzzy Subset Applications to Information Retrieval Systems. Fuzzy Sets and Systems 7: 35–42.
Buell, D. A. & Kraft, D. H. (1981). A Model for a Weighted Retrieval System. Journal of the American Society for Information Science 32: 211–216.
Chellas, B. F. (1980). Modal Logic: An Introduction. Cambridge University Press: Cambridge.
Chen, H. & Dhar, V. (1991). Cognitive Process As a Basis for Intelligent Retrieval System Design. Information Processing & Management 27(5): 405–432.
Chen, H., Lynch, K. J., Basu, K. & Ng, D. (1993). Generating, Integrating and Activating Thesauri for Concept-Based Document Retrieval. IEEE Expert Intelligent Systems & their Applications 8(2): 25–34.
Chiaramella, Y. & Nie, J.-Y. (1989). A Retrieval Model Based on an Extended Modal Logic and Its Application to the RIME Experimental Approach. Research and Development on Information Retrieval-ACM-SIGIR Conference, 25–43, Brussels.
Cooper, W. S. (1995). Some Inconsistencies and Misidentified Modeling Assumptions in Probabilistic Information Retrieval. ACM Transactions on Information Systems 13(1): 100–111.
Croft, W. B. (1987). Approaches to Intelligent Information Retrieval. Information Processing & Management 23(4): 249–254.
Dubois, D. & Prade, H. (1984). Fuzzy Logics and the Generalized Modus Ponens Revisited. Cybernetics and Systems: An International Journal 15: 293–331.
Fox, E. A. (1983). Characterization of Two Experimental Collections in Computer and Information Science. Cornell University, Department of Computer Science, Technical Report TR 83–561, September.
Fox, E. A. (1980). Lexical Relations: Enhancing Effectiveness of Information Retrieval Systems. SIGIR Forum 15(3): 6–35.
Frakes, W. B. & Baeza-Yates, R. (eds.) (1992). Information Retrieval: Data Structures & Algorithms. Prentice-Hall: Englewood Cliffs, N.J.
Fuhr, N. (1992). Probabilistic Models in Information Retrieval. The Computer Journal 35(3): 243–255.
Grefenstette, G. (1992). Use of Syntactic Context to Produce Term Association Lists. 15th ACM-SIGIR Conference, 89–97.
Güntzer, U., Jüttner, G., Seegmüller, G. & Sarre, F. (1989). Automatic Thesaurus Construction by Machine Learning from Retrieval Sessions. Information Processing & Management 25(3): 265–273.
Hancock-Beaulieu, M. & Walker, S. (1992). An Evaluation of Automatic Query Expansion in an Online Library Catalogue. Journal of Documentation 48(4): 406–421.
Hearst, M. A. (1992). Automatic Acquisition of Hyponyms from Large Text Corpora. Fourteenth International Conference on Computational Linguistics COLING'92.
Hindle, D. (1989). Acquiring Disambiguation Rules from Text. 27th Annual Meeting of the Association for Computational Linguistics, 118–125, Pittsburgh.
Kim, Y. W. & Kim, J. H. (1990). A Model of Knowledge Based Information Retrieval with Hierarchical Concept Graph. Journal of Documentation 46(2): 113–136.
Kimoto, H. & Iwadera, T. (1990). Construction of a Dynamic Thesaurus and Its Use for Associated Information Retrieval. 13th ACM-SIGIR Conference, 227–240.
Kraft, D. H. & Buell, D. A. (1983). Fuzzy Sets and Generalized Boolean Retrieval Systems. International Journal of Man-Machine Studies 19: 49–56.
Lee, J. H., Kim, M. H. & Lee, Y. J. (1993). Information Retrieval Based on Conceptual Distance in IS-A Hierarchies. Journal of Documentation 49: 188–207.
Lee, J. H., Kim, M. H. & Lee, Y. J. (1994). Ranking Documents in Thesaurus-Based Boolean Retrieval Systems. Information Processing & Management 30(1): 79–91.
Lu, X. (1990). Document Retrieval: A Structure Approach. Information Processing & Management 26(2): 209–218.
Maron, M. & Kuhns, J. (1960). On Relevance, Probabilistic Indexing and Information Retrieval. Journal of the ACM 7: 216–244.
Miller, G. (ed.) (1990). WordNet: An On-Line Lexical Database.
Miyamoto, S. (1990). Information Retrieval Based on Fuzzy Associations. Fuzzy Sets and Systems 38: 191–205.
Nie, J.-Y. (1989). An Information Retrieval Model Based on Modal Logic. Information Processing & Management 25(5): 477–491.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann: San Mateo CA.
Peat, H. J. & Willett, P. (1991). The Limitations of Term Co-Occurrence Data for Query Expansion in Document Retrieval Systems. Journal of the American Society for Information Science 42(5): 378–383.
Qiu, Y. & Frei, H. P. (1993). Concept Based Query Expansion. Research and Development in Information Retrieval, ACM-SIGIR, 160–169.
Rada, R., Barlow, J., Potharst, J., Zanstra, P. & Bijstra, D. (1991). Document Ranking Using an Enriched Thesaurus. Journal of Documentation 47: 240–253.
Rada, R., Mili, H., Bicknell, E. & Blettner, M. (1989). Development and Application of a Metric on Semantic Nets. IEEE Transactions on Systems, Man, and Cybernetics 19(1): 17–30.
Radecki, T. (1979). Fuzzy Set Theoretical Approach to Document Retrieval. Information Processing & Management 15: 247–259.
Rijsbergen, C. J. v. (1977). A Theoretical Basis for the Use of Co-Occurrence Data in Information Retrieval. Journal of Documentation 33: 106–119.
Rijsbergen, C. J. v. (1979). Information Retrieval, 2nd ed. Butterworths: London.
Rijsbergen, C. J. v. (1986). A Non-Classical Logic for Information Retrieval. The Computer Journal 29(6): 481–485.
Rijsbergen, C. J. v. (1989). Towards an Information Logic. Research and Development on Information Retrieval-ACM-SIGIR, 77–86.
Robertson, S., Maron, M. & Cooper, W. (1982). Probability of Relevance: a Unification of Two Competing Models for Document Retrieval. Information Technology: Research and Development 1: 1–21.
Salton, G. & Buckley, C. (1988). On the Use of Spreading Activation Methods in Automatic Information Retrieval. 11th ACM-SIGIR Conference.
Salton, G. & McGill, M. J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.
Schotch, P. K. (1975). Fuzzy Modal Logic. International Symposium on Multiple-Valued Logic, 176–182. Indiana University, Bloomington.
Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford University Press: Oxford.
Sparck-Jones, K. (1991). Notes and References on Early Automatic Classification Work. SIGIR Forum 25(1): 10–17.
Thompson, P. (1988). Subjective Probability and Information Retrieval: A Review of the Psychological Literature. Journal of Documentation 44(2): 119–143.
Turtle, H. & Croft, W. B. (1990). Inference Network for Document Retrieval. Research and Development on Information Retrieval-ACM-SIGIR, Brussels.
Voorhees, E. M. (1993). Using WordNet to Disambiguate Word Senses for Text Retrieval. Research and Development on Information Retrieval-ACM-SIGIR, Pittsburgh.
Voorhees, E. M. (1994). Query Expansion Using Lexical-Semantic Relations. Research and Development on Information Retrieval-ACM-SIGIR, 61–70, Dublin.
Waller, W. G. & Kraft, D. H. (1979). A Mathematical Model for a Weighted Boolean Retrieval System. Information Processing & Management 15: 235–245.
Wong, S. K. M. & Yao, Y. Y. (1991). A Probabilistic Inference Model for Information Retrieval. Information Systems 16(3): 301–321.
Ying, M. S. (1988). On Standard Models of Fuzzy Modal Logics. Fuzzy Sets and Systems 26: 357–363.
Zadeh, L. A. (1983). The Role of Fuzzy Logic in the Management of Uncertainty in Expert Systems. Fuzzy Sets and Systems 11: 199–227.
Cite this article
Nie, JY., Brisebois, M. An inferential approach to Information Retrieval and its implementation using a manual thesaurus. Artif Intell Rev 10, 409–439 (1996). https://doi.org/10.1007/BF00130693