sPLMap: A Probabilistic Approach to Schema Matching

Henrik Nottelmann &
Umberto Straccia

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3408)

Included in the following conference series:

European Conference on Information Retrieval


Abstract

This paper introduces the first formal framework for learning mappings between heterogeneous schemas that is based on logic and probability theory. This task, also called “schema matching”, is a crucial step in integrating heterogeneous collections. Because schemas may differ in granularity, and because schema attributes do not always match precisely, a general-purpose schema mapping approach must support uncertain mappings, and the mappings have to be learned automatically. The framework combines different classifiers to find suitable mapping candidates (together with their weights), and selects the most likely set of mapping rules. Finally, the framework and several of its variants are evaluated on two data sets.
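The general idea of combining classifier outputs into weighted mapping candidates and then choosing a rule set can be illustrated with a small sketch. This is not the sPLMap algorithm itself: the abstract does not specify how classifiers are combined or how the most likely rule set is computed, so the averaging step, the per-target selection, the `threshold` parameter, the function names, and the attribute names below are all assumptions made for illustration.

```python
def combine_scores(classifier_scores):
    """Average the scores that several classifiers assign to each candidate.

    classifier_scores: list of dicts mapping (source_attr, target_attr)
    to a score in [0, 1]. A classifier that does not mention a candidate
    contributes 0.0 for it. Averaging is an assumed combination rule.
    """
    if not classifier_scores:
        return {}
    candidates = set().union(*classifier_scores)
    n = len(classifier_scores)
    return {c: sum(s.get(c, 0.0) for s in classifier_scores) / n
            for c in candidates}


def select_rules(weights, threshold=0.5):
    """For each target attribute keep the best-weighted source attribute,
    provided its weight reaches the (hypothetical) threshold."""
    best = {}
    for (source, target), w in weights.items():
        if w >= threshold and w > best.get(target, (None, 0.0))[1]:
            best[target] = (source, w)
    return {(source, target): w for target, (source, w) in best.items()}


# Hypothetical scores from a name-based and a content-based classifier.
name_based = {
    ("author", "creator"): 0.9,
    ("title", "creator"): 0.1,
    ("title", "title"): 0.8,
}
content_based = {
    ("author", "creator"): 0.7,
    ("title", "creator"): 0.3,
    ("title", "title"): 0.6,
    ("date", "title"): 0.2,
}

combined = combine_scores([name_based, content_based])
selected = select_rules(combined)
```

With these made-up inputs, `selected` keeps one weighted rule per target attribute (`author` for `creator`, `title` for `title`) and drops the low-weight candidates. A greedy per-target choice like this is only one simple way to pick a rule set; a probabilistic framework could instead score whole rule sets jointly.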



Author information

Authors and Affiliations

Institute of Informatics and Interactive Systems, University of Duisburg-Essen, 47048, Duisburg, Germany
Henrik Nottelmann
ISTI-CNR, Via G. Moruzzi 1, 56124, Pisa, Italy
Umberto Straccia


Editor information

Editors and Affiliations

Departamento de Electrónica y Computación, Universidad de Santiago de Compostela, Spain
David E. Losada
Departamento de Ciencias de la Computación e Inteligencia Artificial E.T.S.I. Informática y de Telecomunicación, Universidad de Granada, 18071, Granada, Spain
Juan M. Fernández-Luna


About this paper

Cite this paper

Nottelmann, H., Straccia, U. (2005). sPLMap: A Probabilistic Approach to Schema Matching. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_7


DOI: https://doi.org/10.1007/978-3-540-31865-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science, Computer Science (R0)
