[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Extending inclusion dependencies with conditions

Published: 01 January 2014 Publication History

Abstract

This paper introduces a class of conditional inclusion dependencies (CINDs), which extends inclusion dependencies (INDs) by enforcing patterns of semantically related data values. We show that CINDs are useful not only in data cleaning, but also in contextual schema matching. We give a full treatment of the static analysis of CINDs, and show that CINDs retain most desired properties of traditional INDs: (a) CINDs are always satisfiable; (b) CINDs are finitely axiomatizable, i.e., there exists a sound and complete inference system for the implication analysis of CINDs; and (c) the implication problem for CINDs has the same complexity as its traditional counterpart, namely, PSPACE-complete, in the absence of attributes with a finite domain; but it is EXPTIME-complete in the general setting. In addition, we investigate the interaction between CINDs and conditional functional dependencies (CFDs), as well as two practical fragments of CINDs, namely acyclic CINDs and unary CINDs. We show the following: (d) the satisfiability problem for the combination of CINDs and CFDs becomes undecidable, even in the absence of finite-domain attributes; (e) in the absence of finite-domain attributes, the implication problem for acyclic CINDs and for unary CINDs retains the same complexity as its traditional counterpart, namely, NP-complete and PTIME, respectively; but in the general setting, it becomes PSPACE-complete and coNP-complete, respectively; and (f) the implication problem for acyclic unary CINDs remains in PTIME in the absence of finite-domain attributes and coNP-complete in the general setting.

References

[1]
Fan, W., Geerts, F., Jia, X. and Kementsietsidis, A., Conditional functional dependencies for capturing data inconsistencies. ACM Trans. Database Syst. v33 i2.
[2]
Cong, G., Fan, W., Geerts, F., Jia, X. and Ma, S., Improving data quality: Consistency and accuracy. In: VLDB, pp. 315-326.
[3]
Chiang, F. and Miller, R.J., Discovering data quality rules. PVLDB. v1 i1. 1166-1177.
[4]
Golab, L., Karloff, H.J., Korn, F., Srivastava, D. and Yu, B., On generating near-optimal tableaux for conditional functional dependencies. PVLDB. v1 i1. 376-390.
[5]
Fan, W., Gao, H., Jia, X., Li, J. and Ma, S., Dynamic constraints for record matching. VLDB J. v20 i4. 495-520.
[6]
Fan, W., Li, J., Ma, S., Tang, N. and Yu, W., Interaction between record matching and data repairing. In: SIGMOD, pp. 469-480.
[7]
Fan, W., Li, J., Tang, N. and Yu, W., Incremental detection of inconsistencies in distributed data. In: ICDE, pp. 318-329.
[8]
Bohannon, P., Fan, W., Flaster, M. and Rastogi, R., A cost-based model and effective heuristic for repairing constraints by value modification. In: SIGMOD, pp. 143-154.
[9]
Chomicki, J. and Marcinkowski, J., Minimal-change integrity maintenance using tuple deletions. Inf. Comput. v197 i1-2. 90-121.
[10]
Haas, L., Hernández, M., Ho, H., Popa, L. and Roth, M., Clio grows up: from research prototype to industrial tool. In: SIGMOD, pp. 805-810.
[11]
Bohannon, P., Elnahrawy, E., Fan, W. and Flaster, M., Putting context into schema matching. In: VLDB, pp. 307-318.
[12]
Bravo, L., Fan, W. and Ma, S., Extending dependencies with conditions. In: VLDB, pp. 243-254.
[13]
Abiteboul, S., Hull, R. and Vianu, V., Foundations of Databases. 1995. Addison-Wesley.
[14]
Cosmadakis, S.S. and Kanellakis, P.C., Functional and inclusion dependencies a graph theoretic approach. In: PODS, pp. 29-37.
[15]
Cosmadakis, S.S., Kanellakis, P.C. and Vardi, M.Y., Polynomial-time implication problems for unary inclusion dependencies. J. ACM. v37 i1. 15-46.
[16]
Kolaitis, P.G., Schema mappings, data exchange, and metadata management. In: PODS, pp. 61-75.
[17]
Ginsburg, S. and Spanier, E.H., On completing tables to satisfy functional dependencies. Theor. Comput. Sci. v39. 309-317.
[18]
Beeri, C. and Vardi, M., A proof procedure for data dependencies. J. ACM. v31 i4. 718-741.
[19]
Fagin, R. and Vardi, M.Y., Armstrong databases for functional and inclusion dependencies. Inf. Process. Lett. v16 i1. 13-19.
[20]
Casanova, M.A., Fagin, R. and Papadimitriou, C.H., Inclusion dependencies and their interaction with functional dependencies. J. Comput. Syst. Sci. v28 i1. 29-59.
[21]
Savitch, W.J., Relationships between nondeterministic and deterministic tape complexities. J. Comput. Syst. Sci. v4. 177-192.
[22]
Chlebus, B.S., Domino-tiling games. J. Comput. Syst. Sci. v32 i3. 374-392.
[23]
van Emde Boas, P., The convenience of tilings. In: Complexity, Logic, and Recursion Theory, Marcel Dekker.
[24]
Sciore, E., Comparing the universal instance and relational data models. Adv. Comput. Res. v3. 139-162.
[25]
Bauckmann, J., Abedjan, Z., Leser, U., Müller, H. and Naumann, F., Discovering conditional inclusion dependencies. In: CIKM, pp. 2094-2098.
[26]
Curé, O., Improving the data quality of drug databases using conditional dependencies and ontologies. J. Data Inf. Qual. v4 i1. 3
[27]
Immerman, N., Nondeterministic space is closed under complementation. SIAM J. Comput. v17 i5. 935-938.
[28]
Szelepcsényi, R., The meothod of focing for nondeterministic automata. Bull. Eur. Assoc. Theor. Comput. Sci. v33. 96-99.
[29]
Papadimitriou, C.H., Computational Complexity. 1994. Addison-Wesley.
[30]
Garey, M. and Johnson, D., Computers and Intractability: A Guide to the Theory of NP-Completeness. 1979. W.H. Freeman and Company.
[31]
Codd, E.F., Relational completeness of data base sublanguages. In: Courant Comput. Sci. Symp. Ser., vol. 6. Prentice-Hall. pp. 65-98.
[32]
Fagin, R. and Vardi, M.Y., The theory of data dependencies - an overview. In: ICALP, pp. 1-22.
[33]
Arenas, M., Bertossi, L.E. and Chomicki, J., Answer sets for consistent query answering in inconsistent databases. Theory Pract. Log. Program. v3 i4-5. 393-424.
[34]
Bertossi, L., Consistent query answering in databases. SIGMOD Rec. v35 i2. 68-76.
[35]
Chomicki, J., Consistent query answering: Five easy pieces. In: ICDT, pp. 1-17.
[36]
Fan, W., Dependencies revisited for improving data quality. In: PODS, pp. 159-170.
[37]
Fan, W., Li, J., Ma, S., Tang, N. and Yu, W., Towards certain fixes with editing rules and master data. VLDB J. v21 i2. 213-238.
[38]
Maher, M.J. and Srivastava, D., Chasing constrained tuple-generating dependencies. In: PODS, pp. 128-138.
[39]
Bravo, L., Fan, W., Geerts, F. and Ma, S., Increasing the expressivity of conditional functional dependencies without extra complexity. In: ICDE, pp. 516-525.
[40]
Chen, W., Fan, W. and Ma, S., Incorporating cardinality constraints and synonym rules into conditional functional dependencies. Inf. Process. Lett. v109 i14. 783-789.
[41]
Chen, W., Fan, W. and Ma, S., Analyses and validation of conditional dependencies with built-in predicates. In: DEXA, pp. 576-591.
[42]
Franconi, E., Palma, A.L., Leone, N., Perri, S. and Scarcello, F., Census data repair: a challenging application of disjunctive logic programming. In: LPAR, pp. 561-578.
[43]
Wijsen, J., Database repairing using updates. ACM Trans. Database Syst. v30 i3. 722-768.
[44]
Lopatenko, A. and Bertossi, L., Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In: ICDT, pp. 179-193.
[45]
Arenas, M., Pérez, J., Reutter, J. and Riveros, C., Composition and inversion of schema mappings. SIGMOD Rec. v38 i3. 17-28.
[46]
Lenzerini, M., Data integration: A theoretical perspective. In: PODS, pp. 233-246.
[47]
Johnson, D.S. and Klug, A.C., Testing containment of conjunctive queries under functional and inclusion dependencies. J. Comput. Syst. Sci. v28 i1. 167-189.
[48]
Deutsch, A., Popa, L. and Tannen, V., Query reformulation with constraints. SIGMOD Rec. v35 i1. 65-73.
[49]
Fan, W., Geerts, F., Li, J. and Xiong, M., Discovering conditional functional dependencies. IEEE Trans. Knowl. Data Eng. v23 i5. 683-698.

Cited By

View all
  1. Extending inclusion dependencies with conditions

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Theoretical Computer Science
    Theoretical Computer Science  Volume 515, Issue
    January, 2014
    128 pages

    Publisher

    Elsevier Science Publishers Ltd.

    United Kingdom

    Publication History

    Published: 01 January 2014

    Author Tags

    1. Data cleaning
    2. Implication
    3. Inclusion dependencies
    4. Satisfiability
    5. Schema matching

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 01 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Reasoning on property graphs with graph generating dependenciesInformation Sciences: an International Journal10.1016/j.ins.2024.120675672:COnline publication date: 1-Jun-2024
    • (2023)Incremental discovery of denial constraintsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-023-00788-y32:6(1289-1313)Online publication date: 17-Mar-2023
    • (2021)Referential Integrity Under Uncertain DataAdvanced Information Systems Engineering10.1007/978-3-030-79382-1_16(265-279)Online publication date: 28-Jun-2021
    • (2019)AutoRepair: an automatic repairing approach over multi-source dataKnowledge and Information Systems10.1007/s10115-018-1284-961:1(227-257)Online publication date: 1-Oct-2019
    • (2019)Data CleaningundefinedOnline publication date: 9-Jul-2019
    • (2017)Discovering context-aware conditional functional dependenciesFrontiers of Computer Science: Selected Publications from Chinese Universities10.5555/3128671.312868711:4(688-701)Online publication date: 1-Aug-2017
    • (2017)Inclusion dependencies and their interaction with functional dependencies in SQLJournal of Computer and System Sciences10.1016/j.jcss.2016.11.00485:C(104-131)Online publication date: 1-May-2017
    • (2017)Logic of temporal attribute implicationsAnnals of Mathematics and Artificial Intelligence10.1007/s10472-016-9526-679:4(307-335)Online publication date: 1-Apr-2017
    • (2016)A Survey on Accessing DataspacesACM SIGMOD Record10.1145/3003665.300367245:2(33-44)Online publication date: 28-Sep-2016
    • (2016)RDFindProceedings of the 2016 International Conference on Management of Data10.1145/2882903.2915206(953-967)Online publication date: 26-Jun-2016
    • Show More Cited By

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media