Abstract
Queries over XML documents challenge search engines to return the most relevant XML components that satisfy the query concepts. In previous work we described a component ranking algorithm that performed relatively well in INEX’03. In this paper we improve that algorithm by introducing a document pivot that compensates for missing term statistics in small components. With the new algorithm we achieved improvements of 30% to 50% in Mean Average Precision over the previous algorithm. We then describe a general mechanism for applying known Query Refinement algorithms from traditional IR on top of this component ranking algorithm, and demonstrate one such algorithm that achieved top results in INEX’04.
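The abstract does not give the formula behind the document pivot. As a rough illustration only, the sketch below assumes the pivot is a linear interpolation between a component's own score and the score of its containing document, so that a small component with sparse term statistics borrows evidence from its document. The names `pivoted_score`, `rank_components`, and `doc_pivot`, the interpolation itself, and the sample scores are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple


def pivoted_score(component_score: float, document_score: float,
                  doc_pivot: float = 0.5) -> float:
    """Blend a component score with its document score (assumed linear form)."""
    if not 0.0 <= doc_pivot <= 1.0:
        raise ValueError("doc_pivot must lie in [0, 1]")
    return doc_pivot * document_score + (1.0 - doc_pivot) * component_score


def rank_components(candidates: List[Tuple[str, float, float]],
                    doc_pivot: float = 0.5) -> List[Tuple[str, float]]:
    """Rank (component_id, component_score, document_score) triples.

    Returns (component_id, blended_score) pairs, best first.
    """
    scored = [(cid, pivoted_score(cs, ds, doc_pivot))
              for cid, cs, ds in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical scores: a strong paragraph in a weak document versus
# a weaker paragraph in a strong document.
candidates = [
    ("doc1/sec[1]/p[3]", 0.9, 0.1),
    ("doc2/sec[2]/p[1]", 0.6, 0.8),
]
no_pivot = rank_components(candidates, doc_pivot=0.0)
with_pivot = rank_components(candidates, doc_pivot=0.5)
```

With `doc_pivot=0.0` the ranking depends on component scores alone; raising it lets the document context reorder the candidates. How the weight would be tuned is left open here.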
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mass, Y., Mandelbrod, M. (2005). Component Ranking and Automatic Query Refinement for XML Retrieval. In: Fuhr, N., Lalmas, M., Malik, S., Szlávik, Z. (eds) Advances in XML Information Retrieval. INEX 2004. Lecture Notes in Computer Science, vol 3493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424550_6
DOI: https://doi.org/10.1007/11424550_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26166-7
Online ISBN: 978-3-540-32053-1
eBook Packages: Computer Science (R0)