Correlated network data publication via differential privacy

Rui Chen¹,
Benjamin C. M. Fung²,
Philip S. Yu³ &
…
Bipin C. Desai⁴

4108 Accesses
139 Citations
Explore all metrics

Abstract

With the increasing prevalence of information networks, research on privacy-preserving network data publishing has received substantial attention recently. There are two streams of relevant research, targeting different privacy requirements. A large body of existing works focus on preventing node re-identification against adversaries with structural background knowledge, while some other studies aim to thwart edge disclosure. In general, the line of research on preventing edge disclosure is less fruitful, largely due to lack of a formal privacy model. The recent emergence of differential privacy has shown great promise for rigorous prevention of edge disclosure. Yet recent research indicates that differential privacy is vulnerable to data correlation, which hinders its application to network data that may be inherently correlated. In this paper, we show that differential privacy could be tuned to provide provable privacy guarantees even in the correlated setting by introducing an extra parameter, which measures the extent of correlation. We subsequently provide a holistic solution for non-interactive network data publication. First, we generate a private vertex labeling for a given network dataset to make the corresponding adjacency matrix form dense clusters. Next, we adaptively identify dense regions of the adjacency matrix by a data-dependent partitioning process. Finally, we reconstruct a noisy adjacency matrix by a novel use of the exponential mechanism. To our best knowledge, this is the first work providing a practical solution for publishing real-life network data via differential privacy. Extensive experiments demonstrate that our approach performs well on different types of real-life network datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Institutional subscriptions

dK-Projection: Publishing Graph Joint Degree Distribution with Node Differential Privacy

Research on Social Networks Publishing Method Under Differential Privacy

Two-Stage Framework for Accurate and Differentially Private Network Information Publication

Notes

The strong attacker mentioned in [24] cannot be prevented as his prior knowledge (without accessing any database) has allowed him to succeed in a privacy attack. Such a privacy attack is not caused by data sharing and therefore beyond the scope of this paper.
A random vertex labeling satisfies 0-differential privacy because it can be viewed as an application of the exponential mechanism with privacy parameter 0.
If \(v_i\) and \(v_j\) are selected to swap with each other, only one query will be affected. This leads to a lower global sensitivity.
For a region \(R\) not containing elements on the diagonal, \(GS(f)=1\) because a single edge difference cannot change two elements in this region.
ca-GrQc, ca-HepTh and wiki-Vote are publicly available in the Stanford large network dataset collection (http://snap.stanford.edu/data/index.html). STM is provided by the Société de transport de Montréal (http://www.stm.info).
Laplace can barely provide any useful information in terms of degree distribution because its KL-divergence is almost the same as an empty graph (i.e., \(|E| = 0\)).
Step 1 means that sampling is not employed.

References

Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou r3579x? anonymized social networks, hidden patterns, and structural steganography. In: Proceedings of the 16th International Conference on World Wide Web (WWW), pp. 181–190 (2007)
Bearman, P.S., Moody, J., Stovel, K.: Chains of affection: the structure of adolescent romantic and sexual networks. Am. J. Sociol. 110(1), 44–91 (2004)
Article Google Scholar
Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)
Article MATH MathSciNet Google Scholar
Bhagat, S., Cormode, G., Krishnamurthy, B., Srivastava, D.: Class-based graph anonymization for social network data. Proc. VLDB Endow. 2(1), 766–777 (2009)
Article Google Scholar
Chen, R., Mohammed, N., Fung, B.C.M., Desai, B.C., Xiong, L.: Publishing set-valued data via differential privacy. Proc. VLDB Endow. 4(11), 1087–1098 (2011)
Google Scholar
Cheng, J., Fu, A.W.C., Liu, J.: K-isomorphism: privacy preserving network publication against structural attacks. In: Proceedings of the 36th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 459–470 (2010)
Cormen, T.H., Stein, C., Rivest, R.L., Leiserson, C.E.: Introduction to Algorithms, 3rd edn. McGraw-Hill Higher Education, New York (2009)
MATH Google Scholar
Cormode, G., Srivastava, D., Yu, T., Zhang, Q.: Anonymizing bipartite graph data using safe groupings. Proc. VLDB Endow. 1(1), 833–844 (2008)
Article Google Scholar
Cormode, G., Procopiuc, M., Shen, E., Srivastava, D., Yu, T.: Differentially private spatial decompositions. In: Proceedings of the 27th IEEE International Conference on Data Engineering (ICDE), pp. 20–31 (2012)
Cuthill, E., McKee, J.: Reducing the bandwidth of sparse symmetric matrices. In: Proceedings of the 1969 24th National Conference, pp. 157–172 (1969)
Dwork, C., McSherry, F., Nissim, K., Smith, A.: Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference (TCC), pp. 265–284 (2006)
Faloutsos, M., Faloutsos, P., Faloutsos, C.: On power-law relationships of the internet topology. In: Proceedings of the 23rd ACM SIGCOMM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM), pp. 251–262 (1999)
Finkel, R.A., Bentley, J.L.: Quad trees: a data structure for retrieval on composite keys. Acta Informatica 4(1), 1–9 (1974)
Article MATH Google Scholar
Fortunato, S.: Community Detection in Graphs. CoRR abs/0906.0612 (2009)
Gupta, A., Ligett, K., McSherry, F., Roth, A., Talwar, K.: Differentially private combinatorial optimization. In: Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1106–1125 (2010)
Gupta, A., Roth, A., Ullman, J.: Iterative constructions and private data release. In: Proceedings of the 9th Theory of Cryptography Conference (TCC), pp. 339–356 (2012)
Hay, M., Miklau, G., Jensen, D., Towsley, D.F., Weis, P.: Resisting structural re-identification in anonymized social networks. Proc. VLDB Endow. 1(1), 102–114 (2008)
Article Google Scholar
Hay, M., Li, C., Miklau, G., Jensen, D.: Accurate estimation of the degree distribution of private networks. In: Proceedings of the 9th IEEE International Conference on Data Mining (ICDM), pp. 169–178 (2009)
Hay, M., Rastogi, V., Miklau, G., Suciu, D.: Boosting the accuracy of differentially private histograms through consistency. Proc. VLDB Endow. 3(1), 1021–1032 (2010)
Article Google Scholar
Hay, M., Liu, K., Miklau, G., Pei, J., Terzi, E.: Privacy-aware data management in information networks. In: Proceedings of the 37th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 1201–1204 (2011)
Kamel, I., Faloutsos, C.: Hilbert R-tree: an improved R-tree using fractals. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pp. 500–509 (1994)
Karwa, V., Raskhodnikova, S., Smith, A., Yaroslavtsev, G.: Private analysis of graph structure. Proc VLDB Endow. 4(11), 1146–1157 (2011)
Google Scholar
Kifer, D., Gehrke, J.: Injecting utility into anonymized datasets. In: Proceedings of the 32nd ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 217–228 (2006)
Kifer, D., Machanavajjhala, A.: No free lunch in data privacy. In: Proceedings of the 37th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 193–204 (2011)
Kossinets, G., Watts, D.: Empirical analysis of an evolving social networks. Science 311(5757), 88–90 (2006)
Article MATH MathSciNet Google Scholar
Liu, K., Terzi, E.: Towards identity anonymization on graphs. In: Proceedings of the 34th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 93–106 (2008)
Liu, L., Wang, J., Liu, J., Zhang, J.: Privacy preservation in social networks with sensitive edge weights. In: Proceedings of the 9th SIAM International Conference on Data Mining (SDM), pp. 954–965 (2009)
McSherry, F.: Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In: Proceedings of the 35th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 19–30 (2009)
McSherry, F., Talwar, K.: Mechanism design via differential privacy. In: Proceedings of the 48th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 94–103 (2007)
Mohammed, N., Chen, R., Fung, B.C.M., Yu, P.S.: Differentially private data release for data mining. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), pp. 493–501 (2011)
Proserpio, D., Goldberg, S., McSherry, F.: A workflow for differentially-private graph synthesis. In: Proceedings of the 2012 ACM Workshop on Online Social Networks (WOSN), pp. 13–18 (2012)
Sala, A., Zhao, X., Wilson, C., Zheng, H., Zhao, B.Y.: Sharing graphs using differentially private graph models. In: Proceedings of the 11th ACM SIGCOMM Conference on Internet Measurement (IMC), pp. 81–98 (2011)
Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)
Article MathSciNet Google Scholar
Wu, L., Ying, X., Wu, X.: Reconstruction from randomized graph via low rank approximation. In: Proceedings of the 10th SIAM International Conference on Data Mining (SDM), pp. 60–71 (2010)
Xiao, X., Wang, G., Gehrke, J.: Differential privacy via wavelet transforms. In: Proceedings of the 26th IEEE International Conference on Data Engineering (ICDE), pp. 225–236 (2010)
Xiao, X., Bender, G., Hay, M., Gehrke, J.: iReduct: differential privacy with reduced relative errors. In: Proceedings of the 37th ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 229–240 (2011)
Ying, X., Wu, X.: Randomizing social networks: a spectrum preserving approach. In: Proceedings of the 8th SIAM International Conference on Data Mining (SDM), pp. 739–750 (2008)
Yuan, M., Chen, L., Yu, P.S.: Personalized privacy protection in social networks. Proc. VLDB Endow. 4(2), 141–150 (2011)
Article Google Scholar
Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE), pp. 506–515 (2008)
Zou, L., Chen, L., Ozsu, M.T.: K-automorphism: a general framework for privacy preserving network publication. Proc. VLDB Endow. 2(1), 946–957 (2009)
Article Google Scholar

Download references

Acknowledgments

We sincerely thank the reviewers for their insightful comments. We thank James Cheng, Ada Wai-Chee Fu and Jia Liu for providing the source code of \(k\)-isomorphism. The research is supported in part by NSERC through Discovery Grants (356065-2013), US NSF through grants CNS-1115234, DBI-0960443 and OISE-1129076 and US Department of Army through grant W911NF-12-1-0066.

Author information

Authors and Affiliations

Hong Kong Baptist University, Kowloon, Hong Kong
Rui Chen
McGill University, Montreal, Canada
Benjamin C. M. Fung
University of Illinois at Chicago, Chicago, IL, USA
Philip S. Yu
Concordia University, Montreal, Canada
Bipin C. Desai

Authors

Rui Chen
View author publications
You can also search for this author in PubMed Google Scholar
Benjamin C. M. Fung
View author publications
You can also search for this author in PubMed Google Scholar
Philip S. Yu
View author publications
You can also search for this author in PubMed Google Scholar
Bipin C. Desai
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Rui Chen.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chen, R., Fung, B.C.M., Yu, P.S. et al. Correlated network data publication via differential privacy. The VLDB Journal 23, 653–676 (2014). https://doi.org/10.1007/s00778-013-0344-8

Download citation

Received: 29 November 2012
Revised: 22 October 2013
Accepted: 24 October 2013
Published: 15 November 2013
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00778-013-0344-8

Correlated network data publication via differential privacy

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

dK-Projection: Publishing Graph Joint Degree Distribution with Node Differential Privacy

Research on Social Networks Publishing Method Under Differential Privacy

Two-Stage Framework for Accurate and Differentially Private Network Information Publication

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Correlated network data publication via differential privacy

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

dK-Projection: Publishing Graph Joint Degree Distribution with Node Differential Privacy

Research on Social Networks Publishing Method Under Differential Privacy

Two-Stage Framework for Accurate and Differentially Private Network Information Publication

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation