[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Efficient protocols for heavy hitter identification with local differential privacy

Published: 01 October 2022 Publication History

Abstract

Local differential privacy (LDP), which is a technique that employs unbiased statistical estimations instead of real data, is usually adopted in data collection, as it can protect every user’s privacy and prevent the leakage of sensitive information. The segment pairs method (SPM), multiple-channel method (MCM) and prefix extending method (PEM) are three known LDP protocols for heavy hitter identification as well as the frequency oracle (FO) problem with large domains. However, the low scalability of these three LDP algorithms often limits their application. Specifically, communication and computation strongly affect their efficiency. Moreover, excessive grouping or sharing of privacy budgets makes the results inaccurate. To address the above-mentioned problems, this study proposes independent channel (IC) and mixed independent channel (MIC), which are efficient LDP protocols for FO with a large domains. We design a flexible method for splitting a large domain to reduce the number of sub-domains. Further, we employ the false positive rate with interaction to obtain an accurate estimation. Numerical experiments demonstrate that IC outperforms all the existing solutions under the same privacy guarantee while MIC performs well under a small privacy budget with the lowest communication cost.

References

[1]
Cormode G, Jha S, Kulkarni T, Li N, Srivastava D, Wang T. Privacy at scale: local differential privacy in practice. In: Proceedings of 2018 International Conference on Management of Data. 2018, 1655–1658
[2]
Peter K, Sewoong O, and Pramod V Extremal mechanisms for local differential privacy Advances in Neural Information Processing Systems (NIPS) 2014 4 2879-2887 January
[3]
Ye M and Barg A Optimal schemes for discrete distribution estimation under locally differential privacy IEEE Transactions on Information Theory 2018 64 8 5662-5676
[4]
Acharya J, Sun Z, Zhang H. Communication efficient, sample optimal, linear time locally private discrete distribution estimation. 2018, arXiv preprint arXiv: 1802.04705
[5]
Bassily R, Smith A. Local, private, efficient protocols for succinct histograms. In: Proceedings of the 47th Annual ACM Symposium on Theory of Computing. 2015, 127–135
[6]
Erlingsson Ú, Pihur V, Korolova A. RAPPOR: randomized aggregatable privacy-preserving ordinal response. In: Proceedings of 2014 ACM SIGSAC Conference on Computer and Communications Security. 2014, 1054–1067
[7]
Abhradeep G T, Andrew H V, Umesh S V, Gaurav K, Julien F, Vivek R S, Doug D. Learning new words. US Patent, No. 9594741, 14 Mar. 2017
[8]
Ding B, Kulkarni J, Yekhanin S. Collecting telemetry data privately. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 3574–3583
[9]
Mishra N, Sandler M. Privacy via pseudorandom sketches. In: Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. 2006, 143–152
[10]
Wang T, Li N, and Jha S Locally differentially private heavy hitter identification IEEE Transactions on Dependable and Secure Computing 2021 18 2 982-993
[11]
Dwork C, McSherry F, Nissim K, Smith A. Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference. 2006, 265–284
[12]
Chatzikokolakis K, Andrés M E, Bordenabe N E, Palamidessi C. Broadening the scope of differential privacy using metrics. In: Proceedings of the 13th International Symposium on Privacy Enhancing Technologies Symposium. 2013, 82–102
[13]
Kifer D, Machanavajjhala A. A rigorous and customizable framework for privacy. In: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. 2012, 77–88
[14]
He X, Machanavajjhala A, Ding B. Blowfish privacy: tuning privacy-utility trade-offs using policies. In: Proceedings of 2014 ACM SIGMOD International Conference on Management of Data. 2014, 1447–1458
[15]
Bun M, Steinke T. Concentrated differential privacy: simplifications, extensions, and lower bounds. In: Proceedings of the 14th International Conference on Theory of Cryptography. 2016, 635–658
[16]
Jorgensen Z, Yu T, Cormode G. Conservative or liberal? Personalized differential privacy. In: Proceedings of the IEEE the 31st International Conference on Data Engineering. 2015, 1023–1034
[17]
Dwork C. Differential privacy: a survey of results. In: Proceedings of the 5th International Conference on Theory and Applications of Models of Computation. 2008, 1–19
[18]
Qin Z, Yang Y, Yu T, Khalil I, Xiao X, Ren K. Heavy hitter estimation over set-valued data with local differential privacy. In: Proceedings of 2016 ACM SIGSAC Conference on Computer and Communications Security. 2016, 192–203
[19]
Duchi J C, Jordan M I, Wainwright M J. Local privacy and minimax bounds: sharp rates for probability estimation. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013, 1529–1537
[20]
Hsu J, Khanna S, Roth A. Distributed private heavy hitters. In: Proceedings of the 39th International Colloquium on Automata, Languages, and Programming. 2012, 461–472
[21]
Kasiviswanathan S P, Lee H K, Nissim K, Raskhodnikova S, and Smith A What can we learn privately? SIAM Journal on Computing 2011 40 3 793-826
[22]
Li Z, Wang T, Lopuhaä-Zwakenberg M, Li N, Škoric B. Estimating numerical distributions under local differential privacy. In: Proceedings of 2020 ACM SIGMOD International Conference on Management of Data. 2020, 621–635
[23]
Jia J, Gong N Z. Calibrate: frequency estimation and heavy hitter identification with local differential privacy via incorporating prior knowledge. In: Proceedings of 2019 IEEE Conference on Computer Communications. 2019, 2008–2016
[24]
Duchi J C, Jordan M I, and Wainwright M J Minimax optimal procedures for locally private estimation Journal of the American Statistical Association 2018 113 521 182-201
[25]
Wang N, Xiao X, Yang Y, Zhao J, Hui S C, Shin H, Shin J, Yu G. Collecting and analyzing multidimensional data with local differential privacy. In: Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE). 2019, 638–649
[26]
Wang S, Qian Y, Du J, Yang W, Huang L, and Xu H Set-valued data publication with local privacy: tight error bounds and efficient mechanisms Proceedings of the VLDB Endowment 2020 13 8 1234-1247
[27]
Ye Q, Hu H, Meng X, Zheng H. PrivKV: key-value data collection with local differential privacy. In: Proceedings of 2019 IEEE Symposium on Security and Privacy (SP). 2019, 317–331
[28]
Gu X, Li M, Cheng Y, Xiong L, Cao Y. PCKV: locally differentially private correlated key-value data collection with optimized utility. In: Proceedings of the 29th USENIX Conference on Security Symposium. 2020, 55
[29]
Cormode G, Kulkarni T, and Srivastava D Answering range queries under local differential privacy Proceedings of the VLDB Endowment 2019 12 10 1126-1138
[30]
Wang N, Xiao X, Yang Y, Hoang T D, Shin H, Shin J, Yu G. PrivTrie: effective frequent term discovery under local differential privacy. In: Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE). 2018, 821–832
[31]
Wang T, Lopuhaä-Zwakenberg M, Li Z, Skoric B, Li N. Locally differentially private frequency estimation with consistency. In: Proceedings of the 27th Annual Network and Distributed System Security Symposium. 2020
[32]
Chen R, Li H, Qin A K, Kasiviswanathan S P, Jin H. Private spatial data aggregation in the local setting. In: Proceedings of the 32nd IEEE International Conference on Data Engineering (ICDE). 2016, 289–300
[33]
Nie Y, Yang W, Huang L, Xie X, Zhao Z, and Wang S A utility-optimized framework for personalized private histogram estimation IEEE Transactions on Knowledge and Data Engineering 2019 31 4 655-669
[34]
Andrés M E, Bordenabe N E, Chatzikokolakis K, Palamidessi C. Geoindistinguishability: Differential privacy for location-based systems. In: Proceedings of 2013 ACM SIGSAC Conference on Computer & Communications Security. 2013, 901–914
[35]
Wang S, Nie Y, Wang P, Xu H, Yang W, Huang L. Local private ordinal data distribution estimation. In: Proceedings of 2017 IEEE Conference on Computer Communications. 2017, 1–9
[36]
Gu X L, Li M, Cao Y, Xiong L. Supporting both range queries and frequency estimation with local differential privacy. In: Proceedings of 2019 IEEE Conference on Communications and Network Security (CNS). 2019, 124–132
[37]
Gursoy M E, Tamersoy A, Truex S, Wei W, and Liu L Secure and utility-aware data collection with condensed local differential privacy IEEE Transactions on Dependable and Secure Computing 2021 18 5 2365-2378
[38]
Murakami T, Kawamoto Y. Utility-optimized local differential privacy mechanisms for distribution estimation. In: Proceedings of the 28th USENIX Conference on Security Symposium. 2019, 1877–1894
[39]
Balle B, Bell J, Gascón A, Nissim K. The privacy blanket of the shuffle model. In: Proceedings of the 39th Annual International Cryptology Conference. 2019, 638–667
[40]
Cheu A, Smith A, Ullman J, Zeber D, Zhilyaev M. Distributed differential privacy via shuffling. In: Proceedings of the 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques. 2019, 375–403
[41]
Erlingsson Ú, Feldman V, Mironov I, Raghunathan A, Talwar K, Thakurta A. Amplification by shuffling: from local to central differential privacy via anonymity. In: Proceedings of the 30th Annual ACM-SIAM Symposium on Discrete Algorithms. 2019, 2468–2479
[42]
Wang T, Blocki J, Li N, Jha S. Locally differentially private protocols for frequency estimation. In: Proceedings of the 26th USENIX Security Symposium. 2017, 729–745
[43]
Acharya J, Sun Z, Zhang H. Hadamard response: estimating distributions privately, efficiently, and with little communication. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics. 2019, 1120–1129
[44]
Fanti G, Pihur V, and Erlingsson Ú Building a RAPPOR with the unknown: privacy-preserving learning of associations and data dictionaries Proceedings on Privacy Enhancing Technologies 2016 2016 3 41-61
[45]
Warner S L Randomized response: a survey technique for eliminating evasive answer bias Journal of the American Statistical Association 1965 60 309 63-69
[46]
Bun M, Nelson J, and Stemmer U Heavy hitters and the structure of local privacy ACM Transactions on Algorithms 2019 15 4 51
[47]
Kairouz P, Bonawitz K A, Ramage D. Discrete distribution estimation under local privacy. In: Proceedings of the 33rd International Conference on Machine Learning. 2016, 2436–2444
[48]
Bassily R, Nissim K, Stemmer U, Thakurta A. Practical locally private heavy hitters. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 2285–2293
[49]
Thakurta A G, Vyrros A H, Vaishampayan U S, Kapoor G, Freudinger J, Prakash V V, Legendre A, Duplinsky S. Emoji frequency detection and deep link frequency: 9705908. 2020-07-11
[50]
Beimel A, Nissim K, Stemmer U. Private learning and sanitization: pure vs. approximate differential privacy. In: Proceedings of the 16th International Workshop on Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. 2013, 363–378
[51]
Bun M, Nissim K, Stemmer U, Vadhan S. Differentially private release and learning of threshold functions. In: Proceedings of the 56th IEEE Annual Symposium on Foundations of Computer Science. 2015, 634–649
[52]
Cormode G, Kulkarni T, Srivastava D. Marginal release under local differential privacy. In: Proceedings of 2018 International Conference on Management of Data. 2018, 131–146
[53]
Nguyên T T, Xiao X, Yang Y, Hui S C, Shin H, Shin J. Collecting and analyzing data from smart device users with local differential privacy. 2016, arXiv preprint arXiv: 1606.05053
[54]
Manning C D, Raghavan P, and Schütze H Introduction to Information Retrieval 2008 Cambridge Cambridge University Press

Cited By

View all
  • (2023)Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries SketchProceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3584372.3588673(79-88)Online publication date: 18-Jun-2023

Index Terms

  1. Efficient protocols for heavy hitter identification with local differential privacy
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image Frontiers of Computer Science: Selected Publications from Chinese Universities
        Frontiers of Computer Science: Selected Publications from Chinese Universities  Volume 16, Issue 5
        Oct 2022
        220 pages
        ISSN:2095-2228
        EISSN:2095-2236
        Issue’s Table of Contents

        Publisher

        Springer-Verlag

        Berlin, Heidelberg

        Publication History

        Published: 01 October 2022
        Accepted: 16 June 2021
        Received: 17 August 2020

        Author Tags

        1. local differential privacy
        2. frequency oracle
        3. heavy hitter

        Qualifiers

        • Research-article

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)0
        • Downloads (Last 6 weeks)0
        Reflects downloads up to 12 Dec 2024

        Other Metrics

        Citations

        Cited By

        View all
        • (2023)Better Differentially Private Approximate Histograms and Heavy Hitters using the Misra-Gries SketchProceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3584372.3588673(79-88)Online publication date: 18-Jun-2023

        View Options

        View options

        Login options

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media