[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3372297.3423363acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article

Estimating g-Leakage via Machine Learning

Published: 02 November 2020 Publication History

Abstract

This paper considers the problem of estimating the information leakage of a system in the black-box scenario, i.e. when the system's internals are unknown to the learner, or too complicated to analyze, and the only available information are pairs of input-output data samples, obtained by submitting queries to the system or provided by a third party. The frequentist approach relies on counting the frequencies to estimate the input-output conditional probabilities, however this method is not accurate when the domain of possible outputs is large. To overcome this difficulty, the estimation of the Bayes error of the ideal classifier was recently investigated using Machine Learning (ML) models, and it has been shown to be more accurate thanks to the ability of those models to learn the input-output correspondence. However, the Bayes vulnerability is only suitable to describe one-try attacks. A more general and flexible measure of leakage is the g-vulnerability, which encompasses several different types of adversaries, with different goals and capabilities. We propose a novel approach to perform black-box estimation of the g-vulnerability using ML which does not require to estimate the conditional probabilities and is suitable for a large class of ML algorithms. First, we formally show the learnability for all data distributions. Then, we evaluate the performance via various experiments using k-Nearest Neighbors and Neural Networks. Our approach outperform the frequentist one when the observables domain is large.

Supplementary Material

MOV File (Copy of CCS_fpe465_Marco Romanelli - Andrew Diehl.mov)
Presentation video

References

[1]
2011. Gowalla dataset. https://snap.stanford.edu/data/loc-Gowalla.html.
[2]
Mário S. Alvim, Konstantinos Chatzikokolakis, Annabelle McIver, Carroll Morgan, Catuscia Palamidessi, and Geoffrey Smith. 2014. Additive and Multiplicative Notions of Leakage, and Their Capacities. In IEEE 27th Computer Security Foundations Symposium, CSF 2014, Vienna, Austria, 19--22 July, 2014. IEEE, 308--322. https://doi.org/10.1109/CSF.2014.29
[3]
Mário S. Alvim, Konstantinos Chatzikokolakis, Annabelle McIver, Carroll Morgan, Catuscia Palamidessi, and Geoffrey Smith. 2016. Axioms for Information Leakage. In Proceedings of the 29th IEEE Computer Security Foundations Symposium (CSF). 77--92. https://doi.org/10.1109/CSF.2016.13
[4]
Mário S. Alvim, Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Geoffrey Smith. 2012. Measuring Information Leakage Using Generalized Gain Functions. In 25th IEEE Computer Security Foundations Symposium, CSF 2012, Cambridge, MA, USA, June 25--27, 2012, Stephen Chong (Ed.). IEEE Computer Society, 265--279. http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6265867
[5]
Christopher M. Bishop. 2007. Pattern recognition and machine learning, 5th Edition .Springer. I--XX, 1--738 pages. http://www.worldcat.org/oclc/71008143
[6]
Nicolás E. Bordenabe and Geoffrey Smith. 2016. Correlated Secrets in Quantitative Information Flow. In CSF. IEEE Computer Society, 93--104. http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=7518122
[7]
Michele Boreale. 2009. Quantifying information leakage in process calculi. Inf. Comput, Vol. 207, 6 (2009), 699--725.
[8]
S. Boucheron, G. Lugosi, and P. Massart. 2013. Concentration Inequalities: A Nonasymptotic Theory of Independence .OUP Oxford.
[9]
Konstantinos Chatzikokolakis, Tom Chothia, and Apratim Guha. 2010. Statistical Measurement of Information Leakage. In TACAS (Lecture Notes in Computer Science, Vol. 6015). Springer, 390--404.
[10]
Konstantinos Chatzikokolakis, Natasha Fernandes, and Catuscia Palamidessi. 2019. Comparing systems: max-case refinement orders and application to differential privacy. In Proceedings of the 32nd IEEE Computer Security Foundations Symposium. Hoboken, United States, 442--457. https://doi.org/10.1109/CSF.2019.00037
[11]
Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Prakash Panangaden. 2008a. Anonymity protocols as noisy channels. Inf. Comput, Vol. 206, 2--4 (2008), 378--401.
[12]
Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Prakash Panangaden. 2008b. On the Bayes risk in information-hiding protocols. Journal of Computer Security, Vol. 16, 5 (2008), 531--571. https://doi.org/10.3233/JCS-2008-0333
[13]
Giovanni Cherubin. 2017. Bayes, not Naive: Security Bounds on Website Fingerprinting Defenses. PoPETs, Vol. 2017, 4 (2017), 215--231.
[14]
Giovanni Cherubin, Konstantinos Chatzikokolakis, and Catuscia Palamidessi. 2019. F-BLEAU: Fast Black-box Leakage Estimation. IEEE Symposium on Security and Privacy, Vol. abs/1902.01350 (2019). http://arxiv.org/abs/1902.01350
[15]
Giovanni Cherubin, Jamie Hayes, and Marc Juárez. 2017. Website Fingerprinting Defenses at the Application Layer. PoPETs, Vol. 2017, 2 (2017), 186--203. https://doi.org/10.1515/popets-2017-0023
[16]
Lénaic Chizat and Francis Bach. 2020. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss. In Conference on Learning Theory, COLT 2020, 9--12 July 2020, Virtual Event [Graz, Austria] (Proceedings of Machine Learning Research, Vol. 125), Jacob D. Abernethy and Shivani Agarwal (Eds.). PMLR, 1305--1338. http://proceedings.mlr.press/v125/chizat20a.html
[17]
Tom Chothia and Apratim Guha. 2011. A Statistical Test for Information Leaks Using Continuous Mutual Information. In CSF. IEEE Computer Society, 177--190. http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=5991608
[18]
Tom Chothia, Yusuke Kawamoto, and Chris Novakovic. 2013. A tool for estimating information leakage. In International Conference on Computer Aided Verification (CAV). Springer, 690--695.
[19]
Tom Chothia, Yusuke Kawamoto, and Chris Novakovic. 2014. LeakWatch: Estimating Information Leakage from Java Programs. In ESORICS (2) (Lecture Notes in Computer Science, Vol. 8713), Miroslaw Kutylowski and Jaideep Vaidya (Eds.). Springer, 219--236.
[20]
David Clark, Sebastian Hunt, and Pasquale Malacaria. 2001. Quantitative Analysis of the Leakage of Confidential Data. Electr. Notes Theor. Comput. Sci, Vol. 59, 3 (2001), 238--251.
[21]
Eloi de Chérisey, Sylvain Guilley, Olivier Rioul, and Pablo Piantanida. 2019. Best Information is Most Successful - Mutual Information and Success Rate in Side-Channel Analysis. IACR Trans. Cryptogr. Hardw. Embed. Syst., Vol. 2019, 2 (2019), 49--79. https://doi.org/10.13154/tches.v2019.i2.49--79
[22]
Luc Devroye, László Györfi, and Gábor Lugosi. 1996. Vapnik-Chervonenkis Theory .Springer New York, New York, NY, 187--213. https://doi.org/10.1007/978--1--4612-0711--5_12
[23]
Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository (Heart Disease Data Set). https://archive.ics.uci.edu/ml/datasets/heart+Disease
[24]
Cynthia Dwork. 2006. Differential Privacy. In 33rd International Colloquium on Automata, Languages and Programming (ICALP 2006) (Lecture Notes in Computer Science, Vol. 4052), Michele Bugliesi, Bart Preneel, Vladimiro Sassone, and Ingo Wegener (Eds.). Springer, 1--12. http://dx.doi.org/10.1007/11787006_1
[25]
Cynthia Dwork, Frank Mcsherry, Kobbi Nissim, and Adam Smith. 2006. Calibrating noise to sensitivity in private data analysis. In In Proceedings of the Third Theory of Cryptography Conference (TCC) (Lecture Notes in Computer Science, Vol. 3876), Shai Halevi and Tal Rabin (Eds.). Springer, 265--284.
[26]
Ehab ElSalamouny and Catuscia Palamidessi. 2020. Full Convergence of the Iterative Bayesian Update and Applications to Mechanisms for Privacy Protection. arxiv: 1909.02961 [cs.CR] To appear in the proceedings of EuroS&P.
[27]
Ian J. Goodfellow, Yoshua Bengio, and Aaron C. Courville. 2016. Deep Learning .MIT Press. 1--775 pages. http://www.deeplearningbook.org/
[28]
T. Hastie, R. Tibshirani, and J. Friedman. 2001. The Elements of Statistical Learning: Data Mining, Inference and Prediction .Springer-Verlag.
[29]
Boris Köpf and David A. Basin. 2007. An information-theoretic model for adaptive side-channel attacks. In Proceedings of the 2007 ACM Conference on Computer and Communications Security, CCS 2007, Alexandria, Virginia, USA, October 28--31, 2007, Peng Ning, Sabrina De Capitani di Vimercati, and Paul F. Syverson (Eds.). ACM, 286--296.
[30]
Boris Köpf and Markus Dürmuth. 2009. A Provably Secure and Efficient Countermeasure against Timing Attacks. In Proceedings of the 2009 22nd IEEE Computer Security Foundations Symposium (CSF '09). IEEE Computer Society, USA, 324--335. https://doi.org/10.1109/CSF.2009.21
[31]
Marco Romanelli, Catuscia Palamidessi, and Konstantinos Chatzikokolakis. 2020. Generating Optimal Privacy-Protection Mechanisms via Machine Learning, In Proceedings of the IEEE International Symposium on Computer Security Foundations (CSF). CoRR .arxiv: 1904.01059 http://arxiv.org/abs/1904.01059
[32]
Shai Shalev-Shwartz and Shai Ben-David. 2014. Understanding machine learning : from theory to algorithms. http://www.worldcat.org/search?qt=worldcat_org_all&q=9781107057135
[33]
Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux. 2011. Quantifying Location Privacy. In IEEE Symposium on Security and Privacy. IEEE Computer Society, 247--262. http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=5955408; http://www.computer.org/csdl/proceedings/sp/2011/4402/00/index.html
[34]
Reza Shokri, George Theodorakopoulos, Carmela Troncoso, Jean-Pierre Hubaux, and Jean-Yves Le Boudec. 2012. Protecting location privacy: optimal strategy against localization attacks. In the ACM Conference on Computer and Communications Security, CCS'12, Raleigh, NC, USA, October 16--18, 2012, Ting Yu, George Danezis, and Virgil D. Gligor (Eds.). ACM, 617--627. http://dl.acm.org/citation.cfm?id=2382196
[35]
Geoffrey Smith. 2009. On the Foundations of Quantitative Information Flow. In Foundations of Software Science and Computational Structures, 12th International Conference, FOSSACS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22--29, 2009. Proceedings (Lecture Notes in Computer Science, Vol. 5504), Luca de Alfaro (Ed.). Springer, 288--302.
[36]
David H. Wolpert. 1996. The Lack of A Priori Distinctions Between Learning Algorithms. Neural Computation, Vol. 8, 7 (1996), 1341--1390.

Cited By

View all
  • (2025)Information Leakage Measures for Imperfect Statistical Information: Application to Non-Bayesian FrameworkIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351658520(1065-1080)Online publication date: 2025
  • (2024)A Black-Box Approach for Quantifying Leakage of Trace-Based Correlated DataProceedings of the 30th Annual International Conference on Mobile Computing and Networking10.1145/3636534.3694722(1980-1985)Online publication date: 4-Dec-2024
  • (2024)Evaluation of Privacy-Utility Tradeoff in Generative Adversarial Network Variants2024 12th International Symposium on Digital Forensics and Security (ISDFS)10.1109/ISDFS60797.2024.10527266(1-6)Online publication date: 29-Apr-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
October 2020
2180 pages
ISBN:9781450370899
DOI:10.1145/3372297
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 November 2020

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. g-vulnerability estimation
  2. machine learning
  3. neural networks

Qualifiers

  • Research-article

Funding Sources

  • European Research Council
  • European Commission?s Marie Sklodowska-Curie Actions (MSCA)

Conference

CCS '20
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

Upcoming Conference

CCS '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)41
  • Downloads (Last 6 weeks)12
Reflects downloads up to 15 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Information Leakage Measures for Imperfect Statistical Information: Application to Non-Bayesian FrameworkIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351658520(1065-1080)Online publication date: 2025
  • (2024)A Black-Box Approach for Quantifying Leakage of Trace-Based Correlated DataProceedings of the 30th Annual International Conference on Mobile Computing and Networking10.1145/3636534.3694722(1980-1985)Online publication date: 4-Dec-2024
  • (2024)Evaluation of Privacy-Utility Tradeoff in Generative Adversarial Network Variants2024 12th International Symposium on Digital Forensics and Security (ISDFS)10.1109/ISDFS60797.2024.10527266(1-6)Online publication date: 29-Apr-2024
  • (2024)Symbolic Quantitative Information Flow for Probabilistic ProgramsPrinciples of Verification: Cycling the Probabilistic Landscape10.1007/978-3-031-75783-9_6(128-154)Online publication date: 13-Nov-2024
  • (2023)Measures of Information Leakage for Incomplete Statistical Information: Application to a Binary Privacy MechanismACM Transactions on Privacy and Security10.1145/362498226:4(1-31)Online publication date: 13-Nov-2023
  • (2023)"Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security ConferencesProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security10.1145/3576915.3623130(3433-3459)Online publication date: 15-Nov-2023
  • (2023)Intelligent Models for Forecasting Repair Timing of Leakage Water Pipelines2023 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC)10.1109/MIUCC58832.2023.10278375(255-260)Online publication date: 27-Sep-2023
  • (2023)Variations and Extensions of Information Leakage Metrics with Applications to Privacy Problems with Imperfect Statistical Information2023 IEEE 36th Computer Security Foundations Symposium (CSF)10.1109/CSF57540.2023.00007(407-422)Online publication date: Jul-2023
  • (2023)Exact and Efficient Bayesian Inference for Privacy Risk QuantificationSoftware Engineering and Formal Methods10.1007/978-3-031-47115-5_15(263-281)Online publication date: 6-Nov-2023
  • (2022)Information Leakage in Index Coding With Sensitive and Nonsensitive MessagesIEEE Journal on Selected Areas in Information Theory10.1109/JSAIT.2022.32321263:4(803-814)Online publication date: Dec-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media