Abstract
Database provenance chronicles the history of updates and modifications to data, and has received much attention due to its central role in scientific data management. However, the use of provenance information still requires a leap of faith. Without additional protections, provenance records are vulnerable to accidental corruption, and even malicious forgery, a problem that is most pronounced in the loosely-coupled multi-user environments often found in scientific research.
This paper investigates the problem of providing integrity and tamper-detection for database provenance. We propose a checksum-based approach, which is well-suited to the unique characteristics of database provenance, including non-linear provenance objects and provenance associated with multiple fine granularities of data. We demonstrate that the proposed solution satisfies a set of desirable security properties, and that the additional time and space overhead incurred by the checksum approach is manageable, making the solution feasible in practice.
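To make the idea of checksums over non-linear provenance concrete, the following is a minimal sketch and not the scheme from the paper. It assumes each provenance record carries a keyed checksum covering its own description plus the checksums of the records it was derived from, so that a record with several inputs (for example, an aggregation) is bound to all of them. The record layout, the names `record_checksum`, `add_record`, and `verify`, and the use of HMAC-SHA-256 with a single shared key in place of per-participant digital signatures are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
from typing import Dict, List

# Hypothetical shared secret; a real deployment would more likely use
# per-participant signing keys rather than one symmetric key.
KEY = b"demo-key-not-for-production"


def record_checksum(key: bytes, payload: str, parent_checksums: List[str]) -> str:
    """Keyed checksum over a record's payload and its parents' checksums."""
    # JSON gives an unambiguous encoding; sorting makes parent order irrelevant.
    msg = json.dumps([payload, sorted(parent_checksums)]).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def add_record(records: Dict[str, dict], key: bytes, rid: str,
               payload: str, parents: List[str]) -> None:
    """Append a provenance record that depends on zero or more earlier records."""
    parent_checksums = [records[p]["checksum"] for p in parents]
    records[rid] = {
        "payload": payload,
        "parents": list(parents),
        "checksum": record_checksum(key, payload, parent_checksums),
    }


def verify(records: Dict[str, dict], key: bytes) -> bool:
    """Recompute every checksum; any mismatch indicates tampering."""
    for rec in records.values():
        parent_checksums = [records[p]["checksum"] for p in rec["parents"]]
        expected = record_checksum(key, rec["payload"], parent_checksums)
        if not hmac.compare_digest(expected, rec["checksum"]):
            return False
    return True


def build_example(key: bytes) -> Dict[str, dict]:
    """A small non-linear history: two inserts feeding one aggregate."""
    records: Dict[str, dict] = {}
    add_record(records, key, "a", "insert A=1 by alice", [])
    add_record(records, key, "b", "insert B=2 by bob", [])
    add_record(records, key, "c", "aggregate C=A+B by carol", ["a", "b"])
    return records
```

In this sketch, altering the payload of record `a` after the fact causes `verify` to fail, because the stored checksum can no longer be reproduced without the key. Storing one checksum per record is also where the time and space overhead mentioned above would arise.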
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, J., Chapman, A., LeFevre, K. (2009). Do You Know Where Your Data’s Been? – Tamper-Evident Database Provenance. In: Jonker, W., Petković, M. (eds) Secure Data Management. SDM 2009. Lecture Notes in Computer Science, vol 5776. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04219-5_2
DOI: https://doi.org/10.1007/978-3-642-04219-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04218-8
Online ISBN: 978-3-642-04219-5
eBook Packages: Computer Science, Computer Science (R0)