[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Accelerating XPath evaluation in any RDBMS

Published: 01 March 2004 Publication History

Abstract

This article is a proposal for a database index structure, the XPath accelerator, that has been specifically designed to support the evaluation of XPath path expressions. As such, the index is capable to support all XPath axes (including ancestor, following, preceding-sibling, descendant-or-self, etc.). This feature lets the index stand out among related work on XML indexing structures which had a focus on the child and descendant axes only. The index has been designed with a close eye on the XPath semantics as well as the desire to engineer its internals so that it can be supported well by existing relational database query processing technology: the index (a) permits set-oriented (or, rather, sequence-oriented) path evaluation, and (b) can be implemented and queried using well-established relational index structures, notably B-trees and R-trees.We discuss the implementation of the XPath accelerator on top of different database backends and show that the index performs well on all levels of the memory hierarchy, including disk-based and main-memory based database systems.

Supplementary Material

PDF File (p91-grust-app.pdf)
Article appendix

References

[1]
Altinel, M. and Franklin, M. J. 2000. Efficient filtering of XML documents for selective dissemination of information. In Proceedings of the 26th International Conference on Very Large Databases (VLDB) (Cairo, Egypt). Morgan-Kaufmann, San Francisco, Calif., 53--64.]]
[2]
Berglund, A., Boag, S., Chamberlin, D., Fernandez, M. F., Kay, M., Robie, J., and Siméon, J. 2002. XML Path Language (XPath) 2.0. Tech. Rep. W3C Working Draft, Version 2.0, World Wide Web Consortium. Aug. http://www.w3.org/TR/xpath20/.]]
[3]
Boag, S., Chamberlin, D., Fernandez, M., Florescu, D., Robie, J., and Siméon, J. 2002. XQuery 1.0: An XML Query Language. Tech. Rep. W3C Working Draft, World Wide Web Consortium. Aug. http://www.w3.org/TR/xquery.]]
[4]
Böhm, C., Berchtold, S., Kriegel, H.-P., and Michel, U. 2000. Multidimensional index structures in relational databases. J. Intel. Inf. Syst. (JIIS) 15, 1, 51--70.]]
[5]
Boncz, P. A. 2002. Monet: A next-generation DBMS kernel for query-intensive applications. Ph.D. dissertation. University of Amsterdam, The Netherlands.]]
[6]
Boncz, P. A. and Kersten, M. L. 1999. MIL primitives for querying a fragmented world. The VLDB J. 8, 2, 101--119.]]
[7]
Chen, Z., Jagadish, H., Korn, F., Koudas, N., Muthukrishnan, S., Ng, R., and Srivastava, D. 2001. Counting twig matches in a tree. In Proceedings of the 17th International Conference on Data Engineering (ICDE) (Heidelberg, Germany). IEEE Computer Society Press, Los Alamitos, Calif., 595--604.]]
[8]
Cohen, E., Kaplan, H., and Milo, T. 2002. Labeling dynamic XML trees. In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS) (Madison, Wisc.). ACM, New York, 271--121.]]
[9]
Cooper, B. F., Sample, N., Franklin, M. J., Hjaltason, G. R., and Shadmon, M. 2001. A fast index for semistructured data. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB) (Rome, Italy). Morgan-Kaufmann, San Francisco, Calif., 341--360.]]
[10]
Dietz, P. F. and Sleator, D. D. 1987. Two algorithms for maintaining order in a list. In Conference Record of the 19th Annual ACM Symposium on Theory of Computing (STOC). ACM, New York, 365--372.]]
[11]
Fernandez, M., Marsh, J., and Nagy, M. 2002. XQuery 1.0 and XPath 2.0 Data Model. Tech. Rep. W3C Working Draft, World Wide Web Consortium. Aug. http://www.w3.org/TR/ query-datamodel.]]
[12]
Florescu, D. and Kossmann, D. 1999. A performance evaluation of alternative mapping schemes for storing XML data in a relational database. Tech. Rep. 3680. INRIA, Rocquencourt, France. May.]]
[13]
Goldman, R. and Widom, J. 1997. DataGuides: Enabling query formulation and optimization in semistructured databases. In Proceedings of the 23rd International Conference on Very Large Databases (VLDB) (Athens, Greece). Morgan-Kaufmann, San Francisco, Calif., 436--445.]]
[14]
Gottlob, G., Koch, C., and Pichler, R. 2002. Efficient algorithms for processing XPath queries. In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB) (Hong Kong, China). Morgan-Kaufmann, San Francisco, Calif., 95--106.]]
[15]
Grust, T. 2002. Accelerating XPath location steps. In Proceedings of the 21st International ACM SIGMOD Conference on Management of Data (Madison, Wisc.). ACM, New York, 109--120.]]
[16]
Grust, T. and van Keulen, M. 2003. Tree awareness for relational database kernels: Staircase join. In Intelligent Search on XML, H. Blanken, H.-J. Schek, and G. Weikum, Eds. Lecture Notes in Computer Science, vol. 2818. Springer-Verlag, Heidelberg, Germany.]]
[17]
Guttman, A. 1984. R-trees: A dynamic index structure for spatial searching. In SIGMOD 1984, Proceedings of Annual Meeting (Boston, Mass.). ACM, New York, 47--57.]]
[18]
Hellerstein, J. M., Naughton, J. F., and Pfeffer, A. 1995. Generalized search trees for database systems. In Proceedings of the 21st International Conference on Very Large Databases (VLDB) (Zurich, Switzerland). Morgan-Kaufmann, San Francisco, Calif., 562--573.]]
[19]
Kamel, I. and Faloutsos, C. 1993. On packing R-trees. In Proceedings of the 2nd International Conference on Information and Knowledge Management (CIKM) (Washington D.C.). ACM, New York, 490--499.]]
[20]
Kaushik, R., Bohannon, P., Naughton, J. F., and Korth, H. K. 2002. Covering indexes for branching path Queries. In Proceedings of the 21st International ACM SIGMOD Conference on Management of Data (Madison, Wisc.). ACM, New York, 133--144.]]
[21]
Kriegel, H.-P., Pötke, M., and Seidl, T. 2000. Managing intervals efficiently in object-relational databases. In Proceedings of the 26th International Conference on Very Large Databases (VLDB) (Cairo, Egypt). Morgan-Kaufmann, San Francisco, Calif., 407--418.]]
[22]
Li, Q. and Moon, B. 2001. Indexing and querying XML data for regular path expressions. In Proceedings of the 27th International Conference on Very Large Data Bases (VLDB) (Rome, Italy). Morgan-Kaufmann, San Francisco, Calif., 361--370.]]
[23]
Olteanu, D., Meuss, H., Furche, T., and Bry, F. 2001. Symmetry in XPath. Tech. Rep. PMS-FB-2001-16. Institute of Computer Science, University of Munich, Munich, Germany.]]
[24]
Roussopoulos, N. and Leifker, D. 1985. Direct spatial search on pictorial databases using packed R-trees. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Austin, Tex.). ACM, New York, 17--31.]]
[25]
SAX (Simple API for XML). http://sax.sourceforge.net/.]]
[26]
Schmidt, A., Waas, F., Kersten, M., Carey, M. J., Manolescu, I., and Busse, R. 2002. XMark: A Benchmark for XML Data Management. In Proceedings of the 28th International Conference on Very Large Databases (VLDB) (Honk Kong, China). Morgan-Kaufmann, San Francisco, Calif., 974--985.]]
[27]
Shanmugasundaram, J., Tufte, K., He, G., Zhang, C., DeWitt, D., and Naughton, J. 1999. Relational databases for querying XML documents: Limitations and opportunities. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB) (Edinburgh, Scotland). Morgan-Kaufmann, San Francisco, Calif., 302--314.]]
[28]
Suciu, D. and Milo, T. 1999. Index structures for path expressions. In Proceedings of the 7th International Conference on Database Theory (ICDT) (Jerusalem, Israel). Lecture Notes in Computer Science, vol. 1540. Springer-Verlag, New York, 277--295.]]
[29]
Wu, Y., Patel, J. M., and Jagadish, H. 2002. Estimating answer sizes for XML queries. In Proceedings of the 8th International Conference on Extending Database Technology (EDBT) (Prague, Czech Republic). Springer-Verlag, New York, 590--608.]]
[30]
Zhang, C., Naughton, J., DeWitt, D., Luo, Q., and Lohman, G. 2001. On supporting containment queries in relational database management systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Santa Barbara, Calif.). ACM, New York, 425--436.]]

Cited By

View all
  • (2023)AN IMPROVED INDEXING METHOD FOR QUERYING BIG XML FILESJournal of Computer Science and Cybernetics10.15625/1813-9663/19018(323-342)Online publication date: 25-Dec-2023
  • (2016)Improving the performance of processing recursive structures of XML path queries and data2016 International Conference on Control, Decision and Information Technologies (CoDIT)10.1109/CoDIT.2016.7593556(176-181)Online publication date: Apr-2016
  • (2016)Dynamic class hierarchy management for multi-version ontology-based personalizationJournal of Computer and System Sciences10.1016/j.jcss.2015.06.00182:1(69-90)Online publication date: 1-Feb-2016
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 29, Issue 1
March 2004
232 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/974750
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 2004
Published in TODS Volume 29, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Main-memory databases
  2. XML
  3. XML indexing
  4. XPath

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)12
  • Downloads (Last 6 weeks)3
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2023)AN IMPROVED INDEXING METHOD FOR QUERYING BIG XML FILESJournal of Computer Science and Cybernetics10.15625/1813-9663/19018(323-342)Online publication date: 25-Dec-2023
  • (2016)Improving the performance of processing recursive structures of XML path queries and data2016 International Conference on Control, Decision and Information Technologies (CoDIT)10.1109/CoDIT.2016.7593556(176-181)Online publication date: Apr-2016
  • (2016)Dynamic class hierarchy management for multi-version ontology-based personalizationJournal of Computer and System Sciences10.1016/j.jcss.2015.06.00182:1(69-90)Online publication date: 1-Feb-2016
  • (2016)Descendants, ancestors, children and parentInformation Processing and Management: an International Journal10.1016/j.ipm.2015.11.00152:3(399-429)Online publication date: 1-May-2016
  • (2015)CoUPEProceedings of the 2015 IEEE International Congress on Big Data10.1109/BigDataCongress.2015.29(142-149)Online publication date: 27-Jun-2015
  • (2014)Scalable XPath Evaluation on Large-Scale Continuously Evolving XML RepositoriesProceedings of the 2014 IEEE International Congress on Big Data10.1109/BigData.Congress.2014.85(546-553)Online publication date: 27-Jun-2014
  • (2014)Semantic-based Structural and Content indexing for the efficient retrieval of queries over large XML data repositoriesFuture Generation Computer Systems10.1016/j.future.2014.02.01037(212-231)Online publication date: Jul-2014
  • (2013)A practical theory of language-integrated queryACM SIGPLAN Notices10.1145/2544174.250058648:9(403-416)Online publication date: 25-Sep-2013
  • (2013)SCISSORProceedings of the 22nd ACM international conference on Information & Knowledge Management10.1145/2505515.2505732(799-804)Online publication date: 27-Oct-2013
  • (2013)A practical theory of language-integrated queryProceedings of the 18th ACM SIGPLAN international conference on Functional programming10.1145/2500365.2500586(403-416)Online publication date: 25-Sep-2013
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media