Abstract
Data provenance, also called lineage or pedigree, describes the origins of a piece of data and the transformations by which it arrived in a database. The two major approaches to representing provenance information are annotation and inversion. Annotations are flexible enough to represent diverse provenance metadata, but the complete provenance data may be larger than the data itself. The inversion method is concise because it uses a single inverse query or function, but the provenance must then be computed on the fly, which can be expensive. This paper proposes a new approach to provenance storage that combines the two methods and adapts to storage constraints.
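As a toy illustration of the trade-off between the two existing approaches (not the combined method this paper proposes), the following Python sketch tracks provenance for a simple per-region sum. The table contents and the names SOURCE, forward, annotate and invert are hypothetical and chosen only for this example.

```python
# Hypothetical source table: (row_id, region, amount).
SOURCE = [(1, "east", 10), (2, "west", 5), (3, "east", 7)]


def forward(rows):
    """The tracked transformation: total amount per region."""
    totals = {}
    for _row_id, region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


def annotate(rows):
    """Annotation style: materialize the contributing row ids for every
    output value. Lookups are cheap, but the stored provenance grows with
    the source and can exceed the size of the derived data."""
    provenance = {}
    for row_id, region, _amount in rows:
        provenance.setdefault(region, []).append(row_id)
    return provenance


def invert(rows, region):
    """Inversion style: store nothing per output value and instead re-scan
    the source with one inverse query whenever provenance is requested."""
    return [row_id for row_id, r, _amount in rows if r == region]


if __name__ == "__main__":
    print(forward(SOURCE))
    print(annotate(SOURCE)["east"])
    print(invert(SOURCE, "east"))
```

Both styles return the same contributing rows for a given output value. They differ in where the cost falls: annotate pays in storage up front, while invert pays in computation on every request.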
Supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 200804861067) and the Special Fund for Basic Scientific Research of Central Colleges, Wuhan University.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, L., Köehler, H., Deng, K., Zhou, X., Sadiq, S. (2011). Providing Flexible Tradeoff for Provenance Tracking. In: Chiu, D.K.W., et al. Web Information Systems Engineering – WISE 2010 Workshops. WISE 2010. Lecture Notes in Computer Science, vol 6724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24396-7_18
DOI: https://doi.org/10.1007/978-3-642-24396-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24395-0
Online ISBN: 978-3-642-24396-7
eBook Packages: Computer Science, Computer Science (R0)