
Beyond Multivariate Microaggregation for Large Record Anonymization

  • Conference paper
Citizen in Sensor Networks (CitiSens 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8313)


Abstract

Microaggregation is one of the most commonly used microdata protection methods. Its basic idea is to anonymize data by partitioning the original records into small groups of at least \(k\) elements and replacing each record with the centroid of its group, thereby achieving \(k\)-anonymity. When records are large, i.e., when the data set has many attributes, the data set is usually split into smaller blocks of attributes to limit information loss, and microaggregation is applied to each block successively and independently. This is called multivariate microaggregation. This technique reduces the information lost when several values are collapsed to the centroid of their group. Unfortunately, with multivariate microaggregation the \(k\)-anonymity property is lost when the intruder knows at least two attributes from different blocks, which may well be the usual case.
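As an illustration of the paragraph above, the following minimal sketch microaggregates each attribute independently and then measures the smallest group of identical masked records. It makes two simplifying assumptions that are not taken from the paper: every block contains a single attribute, and groups are formed from fixed-size runs of sorted values. The function names `microaggregate_1d`, `microaggregate_per_attribute` and `smallest_class` are hypothetical.

```python
from collections import Counter
from statistics import mean


def microaggregate_1d(values, k):
    """Replace each value by the mean of its group of at least k sorted neighbours."""
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so every group has >= k members.
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            masked[i] = centroid
    return masked


def microaggregate_per_attribute(records, k):
    """Microaggregate every attribute on its own (assumed: one attribute per block)."""
    n_attrs = len(records[0])
    masked_columns = [
        microaggregate_1d([row[j] for row in records], k) for j in range(n_attrs)
    ]
    # Reassemble the masked columns into records.
    return [tuple(col[i] for col in masked_columns) for i in range(len(records))]


def smallest_class(items):
    """Size of the smallest group of identical items (k-anonymity holds if >= k)."""
    return min(Counter(items).values())
```

With the toy records `[(1, 10), (2, 30), (3, 20), (4, 40)]` and `k = 2`, each masked attribute on its own still has groups of two identical values, but the four masked records are all distinct. An intruder who knows one attribute from each block can therefore single out every record.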

In this work, we present a new microaggregation method called one-dimension microaggregation (\(Mic1D-k\)). \(Mic1D-k\) mitigates the loss of \(k\)-anonymity by mixing all the values of the original microdata file into a single non-attributed data set through a set of simple pre-processing steps, and then microaggregating all the mixed values together. Our experiments on real data show that our proposal achieves lower disclosure risk than previous approaches while information loss remains at a similar level.
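The general idea of pooling all values before microaggregating can be sketched as follows. The abstract does not say which pre-processing steps \(Mic1D-k\) uses, so this sketch assumes a per-attribute standardization (z-score) to make values comparable across attributes, and it maps the masked values back to the original scale afterwards. Both choices, and the names `microaggregate_1d` and `mic1d_sketch`, are assumptions made for illustration and should not be read as the method described in the paper.

```python
from statistics import mean, pstdev


def microaggregate_1d(values, k):
    """Replace each value by the mean of its group of at least k sorted neighbours."""
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder, so every group has >= k members.
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            masked[i] = centroid
    return masked


def mic1d_sketch(records, k):
    """Pool all attribute values into one list and microaggregate them together."""
    n_attrs = len(records[0])
    columns = [[row[j] for row in records] for j in range(n_attrs)]
    # Assumed pre-processing: standardize each attribute (constant columns keep scale 1).
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    pooled = [
        (row[j] - stats[j][0]) / stats[j][1]
        for row in records
        for j in range(n_attrs)
    ]
    # One univariate microaggregation over the whole, attribute-free pool.
    masked = microaggregate_1d(pooled, k)
    # Put each masked value back in its cell and undo the standardization.
    return [
        tuple(
            masked[i * n_attrs + j] * stats[j][1] + stats[j][0]
            for j in range(n_attrs)
        )
        for i in range(len(records))
    ]
```

Because groups are formed over the pooled values rather than per block, every standardized masked value is shared by at least `k` cells of the file, regardless of which attributes those cells belong to.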



Acknowledgments

This work is partially supported by the Ministry of Science and Technology of Spain under contract TIN2012-34557 and by the BSC-CNS Severo Ochoa program (SEV-2011-00067).

Author information

Corresponding author

Correspondence to Jordi Nin.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Nin, J. (2014). Beyond Multivariate Microaggregation for Large Record Anonymization. In: Nin, J., Villatoro, D. (eds) Citizen in Sensor Networks. CitiSens 2013. Lecture Notes in Computer Science, vol 8313. Springer, Cham. https://doi.org/10.1007/978-3-319-04178-0_8

  • DOI: https://doi.org/10.1007/978-3-319-04178-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-04177-3

  • Online ISBN: 978-3-319-04178-0

  • eBook Packages: Computer Science, Computer Science (R0)
