Relevance feedback revisited: dealing with content and structure in XML documents

International Journal on Digital Libraries

Abstract

Relevance feedback (RF) is a technique that enriches an initial query according to user feedback, with the goal of expressing the user’s needs more precisely. Some open issues arise when considering semi-structured documents such as XML documents. They mainly relate to the form of XML documents, which mix content and structural information, and to the new granularity of information: the main objective of XML retrieval is to select relevant elements within XML documents rather than whole documents. Most RF approaches proposed for XML retrieval are simple adaptations of traditional RF to this new granularity. They usually enrich queries by adding terms extracted from relevant elements instead of terms extracted from whole documents. In this article, we describe a new approach to RF that takes advantage of two sources of evidence: content and structure. We propose using query term proximity to select the terms to add to the initial query, and using generic structures to express structural constraints. Both sources of evidence are used in different combined forms. Experiments were carried out within the INEX evaluation campaign, and the results show the effectiveness of our approaches.
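
As a rough illustration of the content side only, the idea of selecting expansion terms by their proximity to query terms might be sketched as follows. This is not the authors' method: the function name `select_expansion_terms`, the token window, whitespace tokenization, and the co-occurrence count used as a score are all assumptions made for this example, and the structural side (generic structures) is not covered.

```python
from collections import Counter


def select_expansion_terms(query_terms, relevant_elements, window=2, top_k=2):
    """Rank candidate terms by how often they occur near a query term.

    Hypothetical scoring: a candidate earns one point for each relevant
    element occurrence that lies within `window` tokens of a query term.
    """
    query = {term.lower() for term in query_terms}
    scores = Counter()
    for text in relevant_elements:
        # Naive whitespace tokenization (an assumption; no stemming or stopwords).
        tokens = text.lower().split()
        query_positions = [i for i, tok in enumerate(tokens) if tok in query]
        for i, tok in enumerate(tokens):
            if tok in query:
                continue  # query terms are already in the query
            if any(abs(i - p) <= window for p in query_positions):
                scores[tok] += 1
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


# Text of elements the user judged relevant (made-up sample data).
elements = [
    "xml retrieval feedback improves ranking",
    "feedback for xml retrieval",
    "images are stored elsewhere",
]
expansion = select_expansion_terms(["xml", "retrieval"], elements)
expanded_query = ["xml", "retrieval"] + expansion
print(expanded_query)
```

Terms from the third element never score here, because that element contains no query term to be close to; a practical system would also filter stopwords before ranking.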

Author information

Corresponding author

Correspondence to Karen Pinel-Sauvagnat.

About this article

Cite this article

Hlaoua, L., Pinel-Sauvagnat, K. & Boughanem, M. Relevance feedback revisited: dealing with content and structure in XML documents. Int J Digit Libr 11, 1–24 (2010). https://doi.org/10.1007/s00799-010-0061-5

  • DOI: https://doi.org/10.1007/s00799-010-0061-5
