Abstract
The technique of k-anonymization allows databases that contain personal information to be released while ensuring some degree of individual privacy. Anonymization is usually performed by generalizing database entries. We formally study the concept of generalization and propose two information-theoretic measures for capturing the amount of information that is lost during the anonymization process. Those measures are more general and more accurate than those proposed in [19] and [1]. We study the problem of achieving k-anonymity with minimal loss of information. We prove that it is NP-hard and study polynomial approximations for the optimal solution. Our first algorithm gives an approximation guarantee of O(ln k), an improvement over the best-known O(k)-approximation of [1]. As the running time of the algorithm is O(n^{2k}), we also show how to adapt the algorithm of [1] in order to obtain an O(k)-approximation algorithm that is polynomial in both n and k.
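To make the notions of generalization and information loss concrete, the following Python sketch checks k-anonymity of a generalized table and computes one entropy-style loss. It rests on assumptions that are not taken from the abstract: each generalized entry is modelled as a set of permitted values, and the loss of an entry is taken to be the entropy of the attribute's empirical distribution restricted to that set. The function names (`is_k_anonymous`, `restricted_entropy`, `entropy_loss`) and the toy table are hypothetical, and the measure may differ from the ones defined in the paper.

```python
import math
from collections import Counter


def is_k_anonymous(generalized_rows, k):
    """Return True if every generalized record occurs at least k times."""
    counts = Counter(generalized_rows)
    return all(c >= k for c in counts.values())


def restricted_entropy(column, subset):
    """Entropy (in bits) of the empirical distribution of `column`,
    restricted to the values in `subset`.

    Assumes at least one value of `column` lies in `subset`, which holds
    whenever each generalized entry contains the original entry.
    """
    freq = Counter(v for v in column if v in subset)
    total = sum(freq.values())
    return -sum((c / total) * math.log2(c / total) for c in freq.values())


def entropy_loss(original_rows, generalized_rows):
    """Illustrative loss (an assumption, not the paper's definition):
    sum of restricted entropies over all generalized entries."""
    columns = list(zip(*original_rows))
    return sum(
        restricted_entropy(columns[j], subset)
        for row in generalized_rows
        for j, subset in enumerate(row)
    )


# Toy table with two attributes: age and sex (hypothetical data).
original = [("30", "F"), ("32", "F"), ("30", "M"), ("32", "M")]

# Identity generalization: every entry is a singleton set.
identity = [tuple(frozenset({v}) for v in row) for row in original]

# Generalize age to the set {30, 32}; keep sex unchanged.
ages = frozenset({"30", "32"})
generalized = [(ages, frozenset({sex})) for _, sex in original]

print(is_k_anonymous(identity, 2), entropy_loss(original, identity))
print(is_k_anonymous(generalized, 2), entropy_loss(original, generalized))
```

In this toy table, the ungeneralized records are all distinct, so the table is not 2-anonymous and loses no information. Coarsening the age attribute makes the table 2-anonymous at a cost of one bit per record under the assumed measure. Finding the generalization that minimizes such a loss is the optimization problem the abstract describes as NP-hard.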
References
Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Anonymizing tables. In: Eiter, T., Libkin, L. (eds.) ICDT 2005. LNCS, vol. 3363. Springer, Heidelberg (2004)
Aggarwal, G., Mishra, N., Pinkas, B.: Secure computation of the kth-ranked element. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027. Springer, Heidelberg (2004)
Agrawal, D., Aggarwal, C.: On the design and quantification of privacy preserving data mining algorithms. In: PODS (2001)
Agrawal, R., Srikant, R.: Privacy-preserving data mining. In: SIGMOD (2000)
Agrawal, R., Srikant, R., Thomas, D.: Privacy preserving OLAP. In: SIGMOD (2005)
Blum, A., Dwork, C., McSherry, F., Nissim, K.: Practical privacy: The SuLQ framework. In: PODS (2005)
Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Kilian, J. (ed.) TCC 2005. LNCS, vol. 3378. Springer, Heidelberg (2005)
Chvatal, V.: A greedy heuristic for the set-covering problem. Mathematics of Operations Research 4(3), 233–235 (1979)
DeWaal, A.G., Willenborg, L.C.R.J.: Information loss through global recoding and local suppression. Netherlands Official Statistics, Special issue on SDC 14, 17–20 (1999)
Dinur, I., Nissim, K.: Revealing information while preserving privacy. In: PODS (2003)
Dwork, C., Nissim, K.: Privacy-preserving data mining on vertically partitioned databases. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152. Springer, Heidelberg (2004)
Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: PODS (2003)
Freedman, M., Nissim, K., Pinkas, B.: Efficient private matching and set intersection. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027. Springer, Heidelberg (2004)
Goldreich, O., Micali, S., Wigderson, A.: How to play any mental game or A completeness theorem for protocols with honest majority. In: STOC (1987)
Johnson, D.S.: Approximation algorithms for combinatorial problems. JCSS 9, 256–278 (1974)
Kenthapadi, K., Mishra, N., Nissim, K.: Simulatable auditing. In: PODS (2005)
Kleinberg, J., Papadimitriou, C., Raghavan, P.: Auditing boolean attributes. JCSS 66, 244–253 (2003)
Lindell, Y., Pinkas, B.: Privacy preserving data mining. Journal of Cryptology 15(3), 177–206 (2002)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: PODS (2004)
Samarati, P.: Protecting respondent’s privacy in microdata release. TKDE 13, 1010–1027 (2001)
Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information (abstract). In: PODS (1998)
Sweeney, L.: k-Anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
Willenborg, L., DeWaal, T.: Elements of Statistical Disclosure Control. Springer, Heidelberg (2001)
Yao, A.: How to generate and exchange secrets. In: FOCS (1986)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gionis, A., Tassa, T. (2007). k-Anonymization with Minimal Loss of Information. In: Arge, L., Hoffmann, M., Welzl, E. (eds) Algorithms – ESA 2007. ESA 2007. Lecture Notes in Computer Science, vol 4698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75520-3_40
DOI: https://doi.org/10.1007/978-3-540-75520-3_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75519-7
Online ISBN: 978-3-540-75520-3
eBook Packages: Computer Science, Computer Science (R0)