Abstract
Nowadays, most people use several different systems that manage similar information, for example in the home entertainment sector. Unfortunately, these systems are largely heterogeneous: they usually differ in their data model and at the very least in their schema, which makes synchronizing and propagating data a daunting task. Our goal is to cope with this situation in a best-effort manner. To this end, we introduce a symmetric instance-level matching approach that establishes mappings without any user interaction, schema information, dictionaries, or ontologies. Because the resulting mappings are inexact and incomplete, the quality of the propagation has to be quantified. For this purpose, we introduce quality dimensions such as accuracy and completeness. Additionally, visualizing the quality allows users to evaluate the performance of the data propagation process.
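To make the idea of instance-level matching concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes that two attributes correspond when their value sets overlap strongly (Jaccard similarity) and that a mapping is accepted only if both sides pick each other as best match, which is one possible reading of "symmetric". The function names, the threshold of 0.3, the sample records, and the completeness formula (share of source attributes that received a mapping) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set

Record = Dict[str, str]


def value_sets(records: List[Record]) -> Dict[str, Set[str]]:
    """Collect the set of normalized values seen under each attribute name."""
    columns: Dict[str, Set[str]] = {}
    for record in records:
        for attribute, value in record.items():
            columns.setdefault(attribute, set()).add(value.strip().lower())
    return columns


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two value sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(cols_a: Dict[str, Set[str]],
                 cols_b: Dict[str, Set[str]],
                 threshold: float) -> Dict[str, str]:
    """For each attribute in cols_a, pick the most similar attribute in cols_b."""
    result: Dict[str, str] = {}
    for name_a, values_a in cols_a.items():
        scored = [(jaccard(values_a, values_b), name_b)
                  for name_b, values_b in cols_b.items()]
        if scored:
            score, name_b = max(scored)
            if score >= threshold:
                result[name_a] = name_b
    return result


def match_attributes(source: List[Record],
                     target: List[Record],
                     threshold: float = 0.3) -> Dict[str, str]:
    """Keep only mappings that are the best match in both directions."""
    src_cols, tgt_cols = value_sets(source), value_sets(target)
    forward = best_matches(src_cols, tgt_cols, threshold)
    backward = best_matches(tgt_cols, src_cols, threshold)
    return {s: t for s, t in forward.items() if backward.get(t) == s}


def completeness(mapping: Dict[str, str], source: List[Record]) -> float:
    """Assumed definition: fraction of source attributes that were mapped."""
    attributes = value_sets(source)
    return len(mapping) / len(attributes) if attributes else 0.0


# Hypothetical records from two home entertainment devices.
player_a = [
    {"title": "Yesterday", "artist": "The Beatles", "rating": "5"},
    {"title": "Imagine", "artist": "John Lennon", "rating": "4"},
]
player_b = [
    {"name": "yesterday", "performer": "the beatles", "length": "125"},
    {"name": "Imagine", "performer": "John Lennon", "length": "183"},
]

mapping = match_attributes(player_a, player_b)
quality = completeness(mapping, player_a)
```

In this sketch only the instances are inspected; attribute names play no role in the matching, so `title` can be paired with `name` purely because the stored values coincide. An accuracy measure would additionally need reference mappings to compare against, which is why it is left out here.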
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rösch, P. (2006). Tolerant Ad Hoc Data Propagation with Error Quantification. In: Grust, T., et al. Current Trends in Database Technology – EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 4254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11896548_3
DOI: https://doi.org/10.1007/11896548_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46788-5
Online ISBN: 978-3-540-46790-8
eBook Packages: Computer Science, Computer Science (R0)