DUEF-GA: data utility and privacy evaluation framework for graph anonymization

Jordi Casas-Roma ORCID: orcid.org/0000-0002-0617-3303¹

481 Accesses
1 Altmetric
Explore all metrics

Abstract

Anonymization of graph-based data is a problem which has been widely studied over the last years, and several anonymization methods have been developed. Information loss measures have been used to evaluate data utility and information loss in the anonymized graphs. However, there is no consensus about how to evaluate data utility and information loss in privacy-preserving and anonymization scenarios, where the anonymous datasets were perturbed to hinder re-identification processes. Authors use diverse metrics to evaluate data utility and, consequently, it is complex to compare different methods or algorithms in the literature. In this paper, we propose a framework to evaluate and compare anonymous datasets in a common way, providing an objective score to clearly compare methods and algorithms. Our framework includes metrics based on generic information loss measures, such as average distance or betweenness centrality and also task-specific information loss measures, such as community detection or information flow. Additionally, we provide some metrics to examine re-identification and risk assessment. We demonstrate that our framework could help researchers and practitioners to select the best parametrization and/or algorithm to reduce information loss and maximize data utility.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Institutional subscriptions

How to Quantify Graph De-anonymization Risks

k-Degree anonymity and edge selection: improving data utility in large networks

Article 27 April 2016

A survey of graph-modification techniques for privacy-preserving on networks

Article 31 May 2016

Notes

Available at: https://github.com/jcasasr/DUEF-GA.

References

Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou r3579x?: anonymized social networks, hidden patterns, and structural steganography. In: WWW ’07, pp. 181–190. ACM, New York, NY, USA (2007)
Barabási, A.-L., Pósfai, M.: Network Science. Cambridge University Press, Cambridge (2016)
Google Scholar
Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008(10), P10008 (2008)
Article MATH Google Scholar
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25(2), 163–177 (2001)
Article MATH Google Scholar
Brunet, S., Canard, S., Gambs, S., Olivier, B.: Novel differentially private mechanisms for graphs. IACR Cryptol. ePrint Arch. 2016, 745 (2016)
Google Scholar
Bonchi, F., Gionis, A., Tassa, T.: Identity obfuscation in graphs through the information theoretic lens. Inf. Sci. 275, 232–256 (2014)
Article MathSciNet Google Scholar
Cai, B.-J., Wang, H.-Y., Zheng, H.-R., Wang, H.: Evaluation repeated random walks in community detection of social networks. In: 2010 International Conference on Machine Learning and Cybernetics (ICMLC), pp. 1849–1854. IEEE Computer Society, Qingdao (2010)
Casas-Roma, J.: Privacy-preserving on graphs using randomization and edge-relevance. In: Torra, V. (ed.) International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pp. 204–216. Springer International Publishing Switzerland, Tokyo, Japan (2014)
Casas-Roma, J., Herrera-joancomartí, J., Torra, V.: Anonymizing graphs: measuring quality for clustering. Knowl. Inf. Syst. 44(3), 507–528 (2015)
Article Google Scholar
Casas-Roma, J., Herrera-joancomartí, J., Torra, V.: \(k\)-Degree anonymity and edge selection: improving data utility in large networks. Knowl. Inf. Syst. 50(2), 447–474 (2016)
Google Scholar
Casas-Roma, J., Herrera-Joancomartí, J., Torra, V.: A survey of graph-modification techniques for privacy-preserving on networks. Artif. Intell. Rev. 47(3), 341–366 (2017)
Article Google Scholar
Chakrabarti, D., Faloutsos, C.: Graph mining: laws, generators, and algorithms. ACM Comput. Surv. 38(1), 2:1-2:69 (2006)
Article Google Scholar
Chester, S., Kapron, B.M., Ramesh, G., Srivastava, G., Thomo, A., Venkatesh, S.: \(k\)-Anonymization of social networks by vertex addition. In: ADBIS 2011 Research Communications, pp. 107–116. Vienna, Austria, CEUR-WS.org (2011)
Chester, S., Kapron, B.M., Ramesh, G., Srivastava, G., Thomo, A., Venkatesh, S.: Why Waldo befriended the dummy? \(k\)-Anonymization of social networks with pseudo-nodes. Soc. Netw. Anal. Min. 3(3), 381–399 (2013)
Google Scholar
Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 66111 (2004)
Article Google Scholar
Dwork, C.: Differential privacy. In: Proceedings of the 33rd International Conference on Automata, Languages and Programming (ICALP), pp 1–12. Springer-Verlag, Berlin (2006)
Ferri, F., Grifoni, P., Guzzo, T.: New forms of social and professional digital relationships: the case of Facebook. Soc. Netw. Anal. Min. 2(2), 121–137 (2011)
Article Google Scholar
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99(12), 7821–7826 (2002)
Article MathSciNet MATH Google Scholar
Guimerà, R., Danon, L., Díaz-Guilera, A., Giralt, F., Arenas, A.: Self-similar community structure in a network of human interactions. Phys. Rev. E 68(6), 65–103 (2003)
Article Google Scholar
Hay, M., Miklau, G., Jensen, D., Weis, P., Srivastava, S.: Anonymizing social networks. Report, University of Massachusetts Amherst (2007)
Hay, M., Miklau, G., Jensen, D., Towsley, D., Weis, P.: Resisting structural re-identification in anonymized social networks. Proc. VLDB Endow. 1(1), 102–114 (2008)
Article Google Scholar
Hay, M., Li, C., Miklau, G., Jensen, D.: Accurate estimation of the degree distribution of private networks. In: IEEE International Conference on Data Mining (ICDM), pp. 169–178. IEEE Computer Society, Miami, FL (2009)
Hay, M., Liu, K., Miklau, G., Pei, J., and Terzi, E. (2011). Privacy-aware data management in information networks. In: International Conference on Management of Data (SIGMOD). ACM Press, New York, USA, pp. 1201–1204
Isella, L., Stehlé, J., Barrat, A., Cattuto, C., Pinton, J., Van den Broeck, W.: What’s in a crowd? Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)
Article MathSciNet MATH Google Scholar
KONECT: Hamsterster friendships network dataset, April 2017 (2017)
Lancichinetti, A., Fortunato, S.: Community detection algorithms: a comparative analysis. Phys. Rev. E 80(5), 56117 (2009)
Article Google Scholar
Lindner, G., Staudt, C.L., Hamann, M., Meyerhenke, H., Wagner, D.: Structure-preserving sparsification of social networks. In: IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 448–454. ACM, Paris, France (2015)
Liu, K., Terzi, E.: Towards identity anonymization on graphs. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), pp. 93–106. ACM Press, New York (2008)
Macwan, K.R., Patel, S.J.: \(k\)-NMF anonymization in social network data publishing. Comput. J. 61(4), 601–613 (2018)
Google Scholar
Ma, T., Zhang, Y., Cao, J., Shen, J., Tang, M., Tian, Y., Al-Dhelaan, A., Al-Rodhaan, M.: KDVEM: a \(k\)-degree anonymity with vertex and edge modification algorithm. Computing 97(12), 1165–1184 (2015)
MathSciNet MATH Google Scholar
Nagle, F.: Privacy breach analysis in social networks. Mining Social Networks and Security Informatics, pp. 63–77. Springer, Dordrecht (2013)
Chapter Google Scholar
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. In: Proceedings of the 7th International World Wide Web Conference (WWW), pp. 161–172. Brisbane, Australia (1998)
Pons, P., Latapy, M.: Computing communities in large networks using random walks. Computer and Information Sciences (ISCIS), vol. 10, pp. 284–293. Springer, Berlin (2005)
Google Scholar
Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)
Article Google Scholar
Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differentially private graph models. In: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference (IMC ’11), Berlin, Germany, pp. 81–98 (2011)
Sarma, A.D., Molla, A.R., Pandurangan, G., Upfal, E.: Fast distributed PageRank computation. Theor. Comput. Sci. 561(B), 113–121 (2015)
Article MathSciNet MATH Google Scholar
Sweeney, L.: \(k\)-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 557–570 (2002)
MathSciNet MATH Google Scholar
Wagner, I., Eckhoff, D.: Technical privacy metrics. ACM Comput. Surv. 51(3), 1–38 (2018)
Article Google Scholar
Wang, Y., Xie, L., Zheng, B., Lee, K.C.K.: High utility K-anonymization for social network publishing. Knowl. Inf. Syst. (KAIS) 41(3), 697–725 (2014)
Article Google Scholar
Wang, Y., Zheng, B.: Preserving privacy in social networks against connection fingerprint attacks. In: International Conference on Data Engineering (ICDE), pp. 54–65. IEEE, Seoul, South Korea (2015)
Yang, Y., Lutes, J., Li, F., Luo, B., Liu, P.: Stalking online: on user privacy in social networks. In: Proceedings of 2nd ACM Conference on Data and Application Security and Privacy (CODASPY’12), pp. 37–48 (2012)
Ying, X ., Wu, X.: Randomizing social networks: a spectrum preserving approach. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp. 739–750. SIAM, Atlanta (2008)
Ying, X., Pan, K., Wu, X., Guo, L.: Comparisons of randomization and \(k\)-degree anonymization schemes for privacy preserving social network publishing. In: Proceedings of the 3rd Workshop on Social Network Mining and Analysis (SNA-KDD), pp. 10:1-10:10. ACM Press, New York (2009)
Yuan, M., Chen, L., Yu, P.S., Yu, T.: Protecting sensitive labels in social network data anonymization. IEEE Trans. Knowl. Data Eng. 25(3), 633–647 (2013)
Article Google Scholar
Zhang, K., Lo, D., Lim, E.-P., Prasetyo, P.K.: Mining indirect antagonistic communities from social interactions. Knowl. Inf. Syst. 35(3), 553–583 (2013)
Article Google Scholar
Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th International Conference on Data Engineering (ICDE), pp. 506–515. IEEE Computer Society, Washington (2008)
Zou, L., Chen, L., Özsu, M.T.: \(K\)-Automorphism: a general framework for privacy preserving network publication. Proc. VLDB Endow. 2(1), 946–957 (2009)
Google Scholar

Download references

Acknowledgements

This work was supported by the Spanish Government, in part under Grant RTI2018-095094-B-C22 “CONSENT” and in part under Grant TIN2014-57364-C2-2-R “SMARTGLACIS”.

Author information

Authors and Affiliations

Center for Cybersecurity Research of Catalonia (CYBERCAT), Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya (UOC), Barcelona, Spain
Jordi Casas-Roma

Authors

Jordi Casas-Roma
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jordi Casas-Roma.

Ethics declarations

Conflict of interest

Author Jordi Casas-Roma declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Casas-Roma, J. DUEF-GA: data utility and privacy evaluation framework for graph anonymization. Int. J. Inf. Secur. 19, 465–478 (2020). https://doi.org/10.1007/s10207-019-00469-4

Download citation

Published: 23 September 2019
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10207-019-00469-4

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

How to Quantify Graph De-anonymization Risks

k-Degree anonymity and edge selection: improving data utility in large networks

A survey of graph-modification techniques for privacy-preserving on networks

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Ethical approval

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

DUEF-GA: data utility and privacy evaluation framework for graph anonymization

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

How to Quantify Graph De-anonymization Risks

k-Degree anonymity and edge selection: improving data utility in large networks

A survey of graph-modification techniques for privacy-preserving on networks

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Ethical approval

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now