As scientific knowledge evolves, the knowledge within a scientific domain accumulates in multiple databases maintained by separate groups of researchers. While the information across these databases is semantically related, the individual databases are characterized by different data models, query languages, and user interfaces. There is a growing need to exchange such semantically related data in order to detect redundancies and inconsistencies in both the primary (experimental) data and the secondary (inferred) data contained in the databases. However, because of the heterogeneities involved, it is difficult for users of these structurally different databases to query each other's data. To address this problem, this dissertation proposes a system that allows users to query a set of semantically similar databases based on their existing schema knowledge.
The proposed system consists of a query mapping module that automatically maps a (source) query issued against one database schema to an equivalent (target) query against another database schema. The source query may be composed directly by the user using a query language such as SQL. Alternatively, the user can specify the query via a graphical front end (e.g., a query form). Such a graphical query interface is generated dynamically and is tailored to the schema knowledge of the user. The user input, which is accepted by the graphical interface, is converted into the corresponding database query by a “query generator”.
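As a rough illustration of the "query generator" step, the following minimal sketch turns the values collected by a query form into a parameterized SQL query. The function name `generate_query`, the form layout, and the table and column names are hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
def generate_query(table, columns, filters):
    """Build a parameterized SELECT statement from query-form input.

    table, columns, and filters are hypothetical stand-ins for what a
    schema-specific query form would collect from the user.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    # Each filled-in form field becomes one equality condition.
    where = " AND ".join(f"{column} = ?" for column in filters)
    if where:
        sql += f" WHERE {where}"
    # Values are returned separately so they can be bound as parameters.
    return sql, list(filters.values())


sql, params = generate_query("gene", ["symbol", "chromosome"], {"chromosome": "7"})
```

Returning the values separately from the SQL text keeps user input out of the query string, which is one reasonable design choice for a form-driven generator.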
Underlying our system is a metadata model that extends the entity-relationship (ER) model to describe the components of the individual databases and to establish mappings between them. The metadata model also includes constructs such as inheritance and aggregation. To facilitate database comparisons, additional information that is not modeled explicitly in the original database schemas is described in the metamodel. Such additional information is modeled in the form of hidden entities/attributes/relationships, meta-attributes, and domain constraints.
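To make the idea of mapping between schema components concrete, here is a minimal sketch, assuming a simple one-to-one correspondence between entities and attributes of two schemas. The `Query` structure, the mapping tables, and all entity and attribute names are hypothetical; the dissertation's metamodel is richer (inheritance, aggregation, hidden components, domain constraints), none of which this sketch covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    entity: str
    attributes: tuple   # attribute names to retrieve
    conditions: tuple   # (attribute, value) pairs


# Hypothetical metadata: source schema components -> target schema components.
ENTITY_MAP = {"Gene": "Locus"}
ATTRIBUTE_MAP = {
    ("Gene", "symbol"): "locus_symbol",
    ("Gene", "chromosome"): "chrom",
}


def map_query(source):
    """Rewrite a source-schema query into its target-schema equivalent."""
    target_entity = ENTITY_MAP[source.entity]
    attributes = tuple(ATTRIBUTE_MAP[(source.entity, a)] for a in source.attributes)
    conditions = tuple(
        (ATTRIBUTE_MAP[(source.entity, a)], value) for a, value in source.conditions
    )
    return Query(target_entity, attributes, conditions)


source = Query("Gene", ("symbol",), (("chromosome", "7"),))
target = map_query(source)
```

A lookup that fails with `KeyError` here corresponds to a source component with no recorded counterpart in the target schema, which a fuller system would need to handle explicitly.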
In addition to describing database objects, the model captures metadata that are used to dynamically generate schema-specific graphical query interfaces. As a demonstration, we apply our system to perform query mappings between two genome databases, namely, DB/12 and GDB.