A comparative analysis of methodologies for database schema integration

Published: 11 December 1986

Abstract

One of the fundamental principles of the database approach is that a database allows a nonredundant, unified representation of all data managed in an organization. This is achieved only when methodologies are available to support integration across organizational and application boundaries.
Methodologies for database design usually perform the design activity by separately producing several schemas, representing parts of the application, which are subsequently merged. Database schema integration is the activity of integrating the schemas of existing or proposed databases into a global, unified schema.
The aim of the paper is to provide first a unifying framework for the problem of schema integration, then a comparative review of the work done thus far in this area. Such a framework, with the associated analysis of the existing approaches, provides a basis for identifying strengths and weaknesses of individual methodologies, as well as general guidelines for future improvements and extensions.
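To make the idea of merging component schemas concrete, the following is a minimal sketch and not a method taken from the paper. It assumes a deliberately simplified representation in which a schema is a mapping from entity names to attribute sets, and in which a designer has already supplied a synonym table for naming conflicts. The names `Schema`, `rename`, `integrate`, and the sample `personnel` and `payroll` schemas are all hypothetical.

```python
from typing import Dict, Set

# Assumed toy representation: entity name -> set of attribute names.
Schema = Dict[str, Set[str]]


def rename(schema: Schema, synonyms: Dict[str, str]) -> Schema:
    """Apply designer-supplied synonym mappings to entity names."""
    renamed: Schema = {}
    for entity, attrs in schema.items():
        target = synonyms.get(entity, entity)
        # Entities that map to the same name are folded together.
        renamed.setdefault(target, set()).update(attrs)
    return renamed


def integrate(a: Schema, b: Schema, synonyms: Dict[str, str]) -> Schema:
    """Merge two schemas into one global schema.

    Naming conflicts are resolved by renaming; attributes that appear
    in both schemas are stored once because sets drop duplicates.
    """
    merged: Schema = {}
    for schema in (a, b):
        for entity, attrs in rename(schema, synonyms).items():
            merged.setdefault(entity, set()).update(attrs)
    return merged


# Hypothetical component schemas from two applications.
personnel: Schema = {"Employee": {"emp_id", "name", "dept"}}
payroll: Schema = {
    "Worker": {"emp_id", "salary"},
    "Department": {"dept", "budget"},
}

# The designer states that "Worker" and "Employee" denote the same concept.
global_schema = integrate(personnel, payroll, {"Worker": "Employee"})
```

The sketch handles only naming conflicts and duplicate attributes. Structural conflicts, key and dependency conflicts, and generalization hierarchies would need a richer model than plain attribute sets.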

Reviews

Csaba Joseph Egyhazy

Schema integration, as defined by the authors, occurs in two contexts: (1) view integration (in database design), which produces a global conceptual description of a proposed database; and (2) database integration (in distributed database management), which produces the global schema of a collection of databases. To understand the complexity of schema integration, the authors begin by identifying some of the causes for schema diversity. This is followed by a number of comparisons of 12 established schema integration methodologies. The issue of the diversity of data models is resolved by adopting a uniform treatment of concepts, based primarily on the entity-relationship model. The problem of conforming schemas then amounts to resolving type, dependency, key, and behavioral conflicts. The resolution of the conflicts leads to schema transformations. This is an activity usually undertaken by designers, in close collaboration with users. However, as noted by the authors, most of the schema transformations are geared toward removing redundancy, as opposed to simplification or logical optimization. One of the most disturbing revelations of the paper, which I found particularly noteworthy, was the absence of existing specialized languages or data structures for automating at least some of the four major schema integration activities common to all 12 methodologies identified in the paper. Additionally, only a few of these methodologies provide explicit tools or procedures to carry out the process of resolution beyond renaming, redundancy elimination, and generalization. The more difficult ones, such as integrity constraints and language and data structure incompatibilities, remain, for the most part, unresolved.
And finally, as noted by the authors: “none [of these methodologies] provide an analysis or proof of the completeness of the schema transformation operations from the standpoint of being able to resolve any type of conflict that can arise.” This leads to the conclusion that none of the methodologies are based on any established mathematical theory and are merely engaged in defining a consensus schema by possibly changing some user views. This approach, first suggested over ten years ago, should be challenged by the slow, incoming influence of applied database logic among schema integration researchers.

Information & Contributors

Information

Published In

ACM Computing Surveys, Volume 18, Issue 4
Dec. 1986
74 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/27633

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Article Metrics

  • Downloads (Last 12 months): 1,332
  • Downloads (Last 6 weeks): 122
Reflects downloads up to 09 Jan 2025

Cited By

  • (2024) SARS-CoV-2 Genomic Epidemiology Dashboards: A Review of Functionality and Technological Frameworks for the Public Health Response. Genes 15:7 (876). DOI: 10.3390/genes15070876. Online publication date: 3-Jul-2024
  • (2024) Automatic conceptual database design based on heterogeneous source artifacts. Computer Science and Information Systems 21:4 (1913-1961). DOI: 10.2298/CSIS240229065B. Online publication date: 2024
  • (2024) Usability of Health Care Price Transparency Data in the United States: Mixed Methods Study. Journal of Medical Internet Research 26 (e50629). DOI: 10.2196/50629. Online publication date: 29-Mar-2024
  • (2024) Key Technologies of Government Data Heterogeneity and Interoperation: A Survey. 2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) (1673-1679). DOI: 10.1109/IAEAC59436.2024.10503635. Online publication date: 15-Mar-2024
  • (2024) Conceptual Modeling for Bioinformatics. Reference Module in Life Sciences. DOI: 10.1016/B978-0-323-95502-7.00003-8. Online publication date: 2024
  • (2024) An Ostensive Information Architecture to Enhance Semantic Interoperability for Healthcare Information Systems. Information Systems Frontiers 26:1 (277-300). DOI: 10.1007/s10796-023-10379-5. Online publication date: 1-Feb-2024
  • (2024) Transformation and Integration of Exchange Formats. Fundamentals of Information Systems Interoperability (53-106). DOI: 10.1007/978-3-031-48322-6_3. Online publication date: 19-Apr-2024
  • (2023) A Semantic Web-Based Systems Integration to Enhance the Quality of Supply Chain Management. Information Logistics for Organizational Empowerment and Effective Supply Chain Management (73-107). DOI: 10.4018/979-8-3693-0159-3.ch005. Online publication date: 5-Dec-2023
  • (2023) Towards building knowledge by merging multiple ontologies with CoMerger. Applied Ontology 18:4 (307-341). DOI: 10.3233/AO-230020. Online publication date: 1-Jan-2023
  • (2023) 60 Years of Databases (final part). PROBLEMS IN PROGRAMMING (66-103). DOI: 10.15407/pp2023.01.066. Online publication date: Jan-2023
