DOI: 10.5555/645482.653450

Supporting Fine-grained Data Lineage in a Database Visualization Environment

Published: 07 April 1997

Abstract

The lineage of a datum records its processing history. Because such information can be used to trace the source of anomalies and errors in processed data sets, it is valuable to users for a variety of applications, including the investigation of anomalies and debugging. Traditional data lineage approaches rely on metadata. However, metadata does not scale well to fine-grained lineage, especially in large data sets. For example, it is not feasible to store all of the information that is necessary to trace from a specific floating-point value in a processed data set to a particular satellite image pixel in a source data set. In this paper, we propose a novel method to support fine-grained data lineage. Rather than relying on metadata, our approach lazily computes the lineage using a limited amount of information about the processing operators and the base data. We introduce the notions of weak inversion and verification. While our system does not perfectly invert the data, it uses weak inversion and verification to provide a number of guarantees about the lineage it generates. We propose a design for the implementation of weak inversion and verification in an object-relational database management system.
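
The idea of computing lineage lazily, rather than storing it per value, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the block-averaging operator `block_mean`, the function names `weak_invert`, `verify`, and `lineage`, and the tolerance `tol` are all hypothetical. The weak inverse derives candidate source pixels from the operator's block size alone, and verification checks those candidates against the base data by recomputing the processed value.

```python
from typing import List, Tuple

Grid = List[List[float]]
Cell = Tuple[int, int]


def block_mean(src: Grid, k: int) -> Grid:
    """Hypothetical processing operator: average each k-by-k block of src."""
    rows, cols = len(src) // k, len(src[0]) // k
    return [
        [
            sum(src[r * k + dr][c * k + dc] for dr in range(k) for dc in range(k))
            / (k * k)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def weak_invert(cell: Cell, k: int) -> List[Cell]:
    """Candidate source pixels for one output cell.

    Uses only the operator's block size (assumed to be known); no per-value
    lineage metadata is stored anywhere.
    """
    r, c = cell
    return [(r * k + dr, c * k + dc) for dr in range(k) for dc in range(k)]


def verify(src: Grid, candidates: List[Cell], expected: float,
           tol: float = 1e-9) -> bool:
    """Check the candidates against the base data by recomputing the value."""
    mean = sum(src[i][j] for i, j in candidates) / len(candidates)
    return abs(mean - expected) <= tol


def lineage(src: Grid, out: Grid, cell: Cell, k: int) -> Tuple[List[Cell], bool]:
    """Compute lineage on demand: weak inversion followed by verification."""
    candidates = weak_invert(cell, k)
    return candidates, verify(src, candidates, out[cell[0]][cell[1]])


source = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
processed = block_mean(source, 2)
pixels, verified = lineage(source, processed, (0, 1), 2)
print(pixels, verified)
```

In this toy setting the weak inverse happens to be exact. For operators whose inverse can only be approximated, the verification step is where candidates would be confirmed or rejected against the base data.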

Published In

ICDE '97: Proceedings of the Thirteenth International Conference on Data Engineering
April 1997
542 pages
ISBN: 0818678070

Publisher

IEEE Computer Society

United States

Author Tags

  1. anomalies
  2. base data
  3. data processing history
  4. data visualisation
  5. database visualization environment
  6. debugging
  7. error sources
  8. fine-grained data lineage
  9. large data sets
  10. lazy algorithm
  11. limited information
  12. lineage guarantees
  13. metadata
  14. object-relational database management system
  15. processed data sets
  16. processing operators
  17. tracing
  18. verification
  19. weak inversion
