[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/1325851.1325882dlproceedingsArticle/Chapter ViewAbstractPublication PagesvldbConference Proceedingsconference-collections
research-article

Extending dependencies with conditions

Published: 23 September 2007 Publication History

Abstract

This paper introduces a class of conditional inclusion dependencies (CINDs), which extends traditional inclusion dependencies (INDs) by enforcing bindings of semantically related data values. We show that CINDs are useful not only in data cleaning, but are also in contextual schema matching [7]. To make effective use of CINDs in practice, it is often necessary to reason about them. The most important static analysis issue concerns consistency, to determine whether or not a given set of CINDs has conflicts. Another issue concerns implication, i.e., deciding whether a set of CINDs entails another CIND. We give a full treatment of the static analyses of CINDs, and show that CINDs retain most nice properties of traditional INDs: (a) CINDs are always consistent; (b) CINDs are finitely axiomatizable, i.e., there exists a sound and complete inference system for implication of CINDs; and (c) the implication problem for CINDs has the same complexity as its traditional counterpart, namely, PSPACE-complete, in the absence of attributes with a finite domain; but it is EXPTIME-complete in the general setting. In addition, we investigate the interaction between CINDs and conditional functional dependencies (CFDs), an extension of functional dependencies proposed in [9]. We show that the consistency problem for the combination of CINDs and CFDs becomes undecidable. In light of the undecidability, we provide heuristic algorithms for the consistency analysis of CFDs and CINDs, and experimentally verify the effectiveness and efficiency of our algorithms.

References

[1]
S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.
[2]
M. Arenas, L. E. Bertossi, and J. Chomicki. Consistent query answers in inconsistent databases. TPLP, 3(4--5):393--424, 2003.
[3]
F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, editors. The Description Logic Handbook --- Theory, Implementation and Applications. Cambridge University Press, 2003.
[4]
M. Baudinet, J. Chomicki, and P. Wolper. Constraint-Generating Dependencies. JCSS, 59(1):94--115, 1999.
[5]
C. Beeri and M. Vardi. A proof procedure for data dependencies. JACM, 31(4):718--741, 1984.
[6]
L. Bertossi. Consistent query answering in databases. SIGMOD Rec., 35(2):68--76, 2006.
[7]
P. Bohannon, E. Elnahrawy, W. Fan, and M. Flaster. Putting context into schema matching. In VLDB, 2006.
[8]
P. Bohannon, W. Fan, M. Flaster, and R. Rastogi. A cost-based model and effective heuristic for repairing constraints by value modification. In SIGMOD, pages 143--154, 2005.
[9]
P. Bohannon, W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis. Conditional functional dependencies for data cleaning. In ICDE, 2007.
[10]
F. Bry, N. Eisinger, H. Schütz, and S. Torge. SIC: Satisfiability checking for integrity constraints. In DDLP, pages 25--36, 1998.
[11]
M. A. Casanova, R. Fagin, and C. H. Papadimitriou. Inclusion dependencies and their interaction with functional dependencies. JCSS, 28(1):29--59, 1984.
[12]
B. S. Chlebus. Domino-tiling games. JCSS, 32(3):374--392, 1986.
[13]
J. Chomicki and J. Marcinkowski. Minimal-change integrity maintenance using tuple deletions. Information and Computation, 197(1--2):90--121, 2005.
[14]
A. Deutsch, L. Popa, and V. Tannen. Query reformulation with constraints. SIGMOD Record, 35(1):65--73, 2006.
[15]
E. Franconi, A. L. Palma, N. Leone, S. Perri, and F. Scarcello. Census data repair: a challenging application of disjunctive logic programming. In LPAR, pages 561--578, 2001.
[16]
L. Haas, M. Hernández, H. Ho, L. Popa, and M. Roth. Clio grows up: from research prototype to industrial tool. In SIGMOD, 2005.
[17]
D. S. Johnson and A. Klug. Testing containment of conjunctive queries under functional and inclusion dependencies. JCSS, 28(1):167--189, 1984.
[18]
P. G. Kolaitis. Schema mappings, data exchange, and metadata management. In PODS, 2005.
[19]
Lens Computer Science Research Centre. SAT4j home page, 2003. http://www.sat4j.org/.
[20]
A. Lopatenko and L. Bertossi. Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In ICDT, 2007.
[21]
M. J. Maher. Constrained dependencies. Theoretical Computer Science, 173(1):113--149, 1997.
[22]
M. J. Maher and D. Srivastava. Chasing Constrained Tuple-Generating Dependencies. In PODS, 1996.
[23]
R. Manthey. Satisfiability of integrity constraints: Reflections on a neglected problem. In FMLDO, pages 169--179, 1990.
[24]
E. Rahm and H. H. Do. Data cleaning: Problems and current approaches. IEEE Data Eng. Bull., 23(4):3--13, 2000.
[25]
J. Wijsen. Database repairing using updates. TODS, 30(3):722--768, 2005.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image DL Hosted proceedings
VLDB '07: Proceedings of the 33rd international conference on Very large data bases
September 2007
1443 pages
ISBN:9781595936493

Sponsors

  • Yahoo! Research
  • Google Inc.
  • SAP
  • Intel: Intel
  • Microsoft Research: Microsoft Research
  • ORACLE: ORACLE
  • Connex.cc
  • HP invent
  • WKO
  • IBM: IBM

Publisher

VLDB Endowment

Publication History

Published: 23 September 2007

Qualifiers

  • Research-article

Conference

VLDB '07
Sponsor:
  • Intel
  • Microsoft Research
  • ORACLE
  • IBM

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 03 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2018)Cost-efficient data acquisition on online data marketplaces for correlation analysisProceedings of the VLDB Endowment10.14778/3297753.329775712:4(362-375)Online publication date: 1-Dec-2018
  • (2017)Discovering Conditional Matching RulesACM Transactions on Knowledge Discovery from Data10.1145/307064711:4(1-38)Online publication date: 29-Jun-2017
  • (2017)Dependable Data Repairing with Fixing RulesJournal of Data and Information Quality10.1145/30417618:3-4(1-34)Online publication date: 30-Jun-2017
  • (2017)Data ProfilingProceedings of the 2017 ACM International Conference on Management of Data10.1145/3035918.3054772(1747-1751)Online publication date: 9-May-2017
  • (2017)A Novel Cost-Based Model for Data RepairingIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2016.263792829:4(727-742)Online publication date: 1-Apr-2017
  • (2016)RDFindProceedings of the 2016 International Conference on Management of Data10.1145/2882903.2915206(953-967)Online publication date: 26-Jun-2016
  • (2016)Quality-Based Online Data ReconciliationACM Transactions on Internet Technology10.1145/280688816:1(1-21)Online publication date: 1-Feb-2016
  • (2015)Messing up with BARTProceedings of the VLDB Endowment10.14778/2850578.28505799:2(36-47)Online publication date: 1-Oct-2015
  • (2015)Data QualityACM SIGMOD Record10.1145/2854006.285400844:3(7-18)Online publication date: 3-Dec-2015
  • (2014)Query Rewriting and Optimization for Ontological DatabasesACM Transactions on Database Systems10.1145/263854639:3(1-46)Online publication date: 7-Oct-2014
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media