Detection of corrupted schema mappings in XML data integration systems

Published: 14 October 2009

Abstract

In modern data integration scenarios, many remote data sources are located on the Web and are accessible only through forms or Web services, and no guarantee is given about their stability. In these contexts, detecting mappings that have been corrupted by a change in the source or target schema is a key problem. A corrupted mapping fails to match the target or the source schema; hence it cannot transform data conforming to a schema S into data conforming to a schema T, nor can it be used for effective query reformulation.
This article describes a novel technique for maintaining schema mappings in XML data integration systems, based on a notion of mapping correctness relying on the denotational semantics of mappings.
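
As an illustration only, the idea of a mapping that "fails to match" a schema can be sketched with a toy structural check. This is not the article's technique, which relies on the denotational semantics of mappings; the nested-dictionary schema model, the path-pair mapping format, and the names `path_matches` and `corrupted_rules` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy schema: each element name maps to its allowed child elements.
Schema = Dict[str, "Schema"]


def path_matches(schema: Schema, path: str) -> bool:
    """Return True if every step of a simple /a/b/c path exists in the schema tree."""
    node = schema
    for step in path.strip("/").split("/"):
        if step not in node:
            return False
        node = node[step]
    return True


def corrupted_rules(source: Schema, target: Schema,
                    mapping: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """List the (source_path, target_path) rules that no longer match their schema."""
    return [(s, t) for s, t in mapping
            if not (path_matches(source, s) and path_matches(target, t))]


# Hypothetical schemas: version 2 of the source renames 'author' to 'writer'.
source_v1 = {"library": {"book": {"title": {}, "author": {}}}}
source_v2 = {"library": {"book": {"title": {}, "writer": {}}}}
target = {"catalog": {"item": {"name": {}, "creator": {}}}}

mapping = [("/library/book/title", "/catalog/item/name"),
           ("/library/book/author", "/catalog/item/creator")]
```

Under these assumptions, checking `mapping` against `source_v1` reports no corrupted rules, while checking it against `source_v2` flags the rule that still refers to the renamed `author` element.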

Supplementary Material

Colazzo Appendix (a14-colazzo-apndx.pdf)
Online appendix to detection of corrupted schema mappings in XML data integration systems. The appendix supports the information on article 14.

Cited By

  • (2018) State-of-the-art on mapping maintenance and challenges towards a fully automatic approach. Expert Systems with Applications: An International Journal 42:3 (1465-1478). DOI: 10.1016/j.eswa.2014.08.047. Online publication date: 29-Dec-2018
  • (2011) Schemas for safe and efficient XML processing. Proceedings of the 2011 IEEE 27th International Conference on Data Engineering (1378-1379). DOI: 10.1109/ICDE.2011.5767960. Online publication date: 11-Apr-2011

Published In

ACM Transactions on Internet Technology, Volume 9, Issue 4
September 2009
165 pages
ISSN:1533-5399
EISSN:1557-6051
DOI:10.1145/1592446
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 October 2009
Accepted: 01 May 2009
Revised: 01 April 2009
Received: 01 November 2008
Published in TOIT Volume 9, Issue 4

Author Tags

  1. XML
  2. data exchange
  3. data integration
  4. mapping correctness
  5. p2p systems
  6. type inference
  7. type systems

Qualifiers

  • Research-article
  • Research
  • Refereed
