[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3372297.3417278acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article
Open access

Zero Knowledge Proofs for Decision Tree Predictions and Accuracy

Published: 02 November 2020 Publication History

Abstract

Machine learning has become increasingly prominent and is widely used in various applications in practice. Despite its great success, the integrity of machine learning predictions and accuracy is a rising concern. The reproducibility of machine learning models that are claimed to achieve high accuracy remains challenging, and the correctness and consistency of machine learning predictions in real products lack any security guarantees. In this paper, we initiate the study of zero knowledge machine learning and propose protocols for zero knowledge decision tree predictions and accuracy tests. The protocols allow the owner of a decision tree model to convince others that the model computes a prediction on a data sample, or achieves a certain accuracy on a public dataset, without leaking any information about the model itself. We develop approaches to efficiently turn decision tree predictions and accuracy into statements of zero knowledge proofs. We implement our protocols and demonstrate their efficiency in practice. For a decision tree model with 23 levels and 1,029 nodes, it only takes 250 seconds to generate a zero knowledge proof proving that the model achieves high accuracy on a dataset of 5,000 samples and 54 attributes, and the proof size is around 287 kilobytes.

Supplementary Material

MOV File (Copy of CCS2020_fpx268_Jiaheng Zhang - Andrew Diehl.mov)
Presentation video

References

[1]
Miklós Ajtai. 1996. Generating hard instances of lattice problems. In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing. ACM, 99--108.
[2]
Scott Ames, Carmit Hazay, Yuval Ishai, and Muthuramakrishnan Venkitasubramaniam. 2017. Ligero: Lightweight sublinear arguments without a trusted setup. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security .
[3]
Stephanie Bayer and Jens Groth. 2012. Efficient zero-knowledge argument for correctness of a shuffle. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 263--280.
[4]
Eli Ben-Sasson, Iddo Bentov, Yinon Horesh, and Michael Riabzev. 2019 a. Scalable zero knowledge with no trusted setup. In Annual International Cryptology Conference . Springer, 701--732.
[5]
Eli Ben-Sasson, Alessandro Chiesa, Daniel Genkin, Eran Tromer, and Madars Virza. [n.d.] a. SNARKs for C: Verifying program executions succinctly and in zero knowledge. In CRYPTO 2013 .
[6]
Eli Ben-Sasson, Alessandro Chiesa, Michael Riabzev, Nicholas Spooner, Madars Virza, and Nicholas P. Ward. 2019 b. Aurora: Transparent Succinct Arguments for R1CS. In Advances in Cryptology textendash EUROCRYPT 2019. Springer International Publishing, 103--128. https://doi.org/10.1007/978--3-030--17653--2_4
[7]
Eli Ben-Sasson, Alessandro Chiesa, Eran Tromer, and Madars Virza. [n.d.] b. Succinct Non-Interactive Zero Knowledge for a von Neumann Architecture. In Proceedings of the USENIX Security Symposium, 2014 .
[8]
Eli Ben-Sasson, Alessandro Chiesa, Eran Tromer, and Madars Virza. 2014. Scalable zero knowledge via cycles of elliptic curves. In CRYPTO 2014 . 276--294.
[9]
Jonathan Bootle, Andrea Cerulli, Pyrros Chaidos, Jens Groth, and Christophe Petit. 2016. Efficient zero-knowledge arguments for arithmetic circuits in the discrete log setting. In International Conference on the Theory and Applications of Cryptographic Techniques .
[10]
Jonathan Bootle, Andrea Cerulli, Essam Ghadafi, Jens Groth, Mohammad Hajiabadi, and Sune K Jakobsen. 2017. Linear-time zero-knowledge proofs for arithmetic circuit satisfiability. In International Conference on the Theory and Application of Cryptology and Information Security. Springer, 336--365.
[11]
Jonathan Bootle, Andrea Cerulli, Jens Groth, Sune Jakobsen, and Mary Maller. 2018. Arya: Nearly linear-time zero-knowledge proofs for correct program execution. In International Conference on the Theory and Application of Cryptology and Information Security. Springer, 595--626.
[12]
Raphael Bost, Raluca Ada Popa, Stephen Tu, and Shafi Goldwasser. 2015. Machine learning classification over encrypted data. In NDSS, Vol. 4324. 4325.
[13]
Benjamin Braun, Ariel J. Feldman, Zuocheng Ren, Srinath T. V. Setty, Andrew J. Blumberg, and Michael Walfish. [n.d.]. Verifying computations with state. In ACM SIGOPS 24th Symposium on Operating Systems Principles, SOSP, 2013 .
[14]
B. Bünz, J. Bootle, D. Boneh, A. Poelstra, P. Wuille, and G. Maxwell. [n.d.]. Bulletproofs: Short Proofs for Confidential Transactions and More. In Proceedings of the Symposium on Security and Privacy (SP), 2018, Vol. 00. 319--338.
[15]
Matteo Campanelli, Dario Fiore, and Ana"is Querol. [n.d.]. LegoSNARK: Modular Design and Composition of Succinct Zero-Knowledge Proofs. In CCS 2019 .
[16]
Matteo Campanelli, Rosario Gennaro, Steven Goldfeder, and Luca Nizzardo. 2017. Zero-knowledge contingent payments revisited: Attacks and payments for services. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security . ACM, 229--243.
[17]
Craig Costello, Cé dric Fournet, Jon Howell, Markulf Kohlweiss, Benjamin Kreuter, Michael Naehrig, Bryan Parno, and Samee Zahur. [n.d.]. Geppetto: Versatile Verifiable Computation. In S&P 2015 .
[18]
Dheeru Dua and Casey Graff. 2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml
[19]
Amos Fiat and Adi Shamir. [n.d.]. How to Prove Yourself: Practical Solutions to Identification and Signature Problems. In Crypto 1986 .
[20]
Dario Fiore, Cédric Fournet, Esha Ghosh, Markulf Kohlweiss, Olga Ohrimenko, and Bryan Parno. 2016. Hash first, argue later: Adaptive verifiable computations on outsourced data. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security .
[21]
Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security - CCS textquotesingle15 . ACM Press. https://doi.org/10.1145/2810103.2813677
[22]
Zahra Ghodsi, Tianyu Gu, and Siddharth Garg. 2017. Safetynets: Verifiable execution of deep neural networks on an untrusted cloud. In Advances in Neural Information Processing Systems. 4672--4681.
[23]
Shafi Goldwasser, Silvio Micali, and Charles Rackoff. 1989. The knowledge complexity of interactive proof systems. SIAM Journal on computing, Vol. 18, 1 (1989), 186--208.
[24]
Jens Groth. 2009. Linear algebra with sub-linear zero-knowledge arguments. In Advances in Cryptology-CRYPTO 2009 . Springer, 192--208.
[25]
Jens Groth. 2016. On the Size of Pairing-Based Non-interactive Arguments. In Advances in Cryptology - EUROCRYPT 2016 - 35th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Vienna, Austria, May 8--12, 2016, Proceedings, Part II. 305--326.
[26]
Joe Kilian. 1992. A Note on Efficient Zero-Knowledge Proofs and Arguments (Extended Abstract). In Proceedings of the ACM Symposium on Theory of Computing .
[27]
Vadim Lyubashevsky, Daniele Micciancio, Chris Peikert, and Alon Rosen. [n.d.]. SWIFFT: A Modest Proposal for FFT Hashing. In Fast Software Encryption . Springer Berlin Heidelberg, 54--72. https://doi.org/10.1007/978--3--540--71039--4_4
[28]
Ralph C. Merkle. [n.d.]. A Certified Digital Signature. In CRYPTO 1989. 218--238.
[29]
Silvio Micali. 2000. Computationally Sound Proofs. SIAM J. Comput. (2000).
[30]
Bryan Parno, Jon Howell, Craig Gentry, and Mariana Raykova. 2013. Pinocchio: Nearly practical verifiable computation. In S&P 2013 . 238--252.
[31]
Jacob T Schwartz. 1979. Probabilistic algorithms for verification of polynomial identities. In International Symposium on Symbolic and Algebraic Manipulation. Springer, 200--215.
[32]
Roberto Tamassia. 2003. Authenticated data structures. In European symposium on algorithms. Springer, 2--5.
[33]
Anselme Tueno, Florian Kerschbaum, and Stefan Katzenbeisser. 2019. Private evaluation of decision trees using sublinear cost. Proceedings on Privacy Enhancing Technologies, Vol. 2019, 1 (2019), 266--286.
[34]
Jaideep Vaidya and Chris Clifton. 2005. Privacy-preserving decision trees over vertically partitioned data. In IFIP Annual Conference on Data and Applications Security and Privacy. Springer, 139--152.
[35]
Riad S Wahby, Srinath TV Setty, Zuocheng Ren, Andrew J Blumberg, and Michael Walfish. 2015. Efficient RAM and control flow in verifiable outsourced computation. In NDSS .
[36]
Riad S Wahby, Ioanna Tzialla, Abhi Shelat, Justin Thaler, and Michael Walfish. 2018. Doubly-efficient zkSNARKs without trusted setup. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 926--943.
[37]
Tiacheng Xie, Jiaheng Zhang, Yupeng Zhang, Charalampos Papamanthou, and Dawn Song. 2019. Libra: Succinct Zero-Knowledge Proofs with Optimal Prover Computation. In Advances in Cryptology (CRYPTO) .
[38]
Jiaheng Zhang, Tiancheng Xie, Yupeng Zhang, and Dawn Song. [n.d.]. Transparent Polynomial Delegation and Its Applications to Zero Knowledge Proof. In S&P 2020 .
[39]
Yupeng Zhang, Daniel Genkin, Jonathan Katz, Dimitrios Papadopoulos, and Charalampos Papamanthou. 2017a. vSQL: Verifying arbitrary SQL queries over dynamic outsourced databases. In Security and Privacy (SP), 2017 IEEE Symposium on. IEEE, 863--880.
[40]
Yupeng Zhang, Daniel Genkin, Jonathan Katz, Dimitrios Papadopoulos, and Charalampos Papamanthou. 2017b. A Zero-Knowledge Version of vSQL . Cryptology ePrint.
[41]
Yupeng Zhang, Daniel Genkin, Jonathan Katz, Dimitrios Papadopoulos, and Charalampos Papamanthou. 2018. vRAM: Faster verifiable RAM with program-independent preprocessing. In Proceeding of IEEE Symposium on Security and Privacy (S&P) .
[42]
Lingchen Zhao, Qian Wang, Cong Wang, Qi Li, Chao Shen, Xiaodong Lin, Shengshan Hu, and Minxin Du. 2019. VeriML: Enabling Integrity Assurances and Fair Payments for Machine Learning as a Service. arXiv preprint arXiv:1909.06961 (2019).
[43]
Richard Zippel. 1979. Probabilistic algorithms for sparse polynomials. In International Symposium on Symbolic and Algebraic Manipulation. Springer, 216--226.

Cited By

View all
  • (2024)Decision Tree-Based Federated Learning: A SurveyBlockchains10.3390/blockchains20100032:1(40-60)Online publication date: 7-Mar-2024
  • (2024)Sparrow: Space-Efficient zkSNARK for Data-Parallel Circuits and Applications to Zero-Knowledge Decision TreesProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security10.1145/3658644.3690318(3110-3124)Online publication date: 2-Dec-2024
  • (2024)zkLLM: Zero Knowledge Proofs for Large Language ModelsProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security10.1145/3658644.3670334(4405-4419)Online publication date: 2-Dec-2024
  • Show More Cited By

Index Terms

  1. Zero Knowledge Proofs for Decision Tree Predictions and Accuracy

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
    October 2020
    2180 pages
    ISBN:9781450370899
    DOI:10.1145/3372297
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 November 2020

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. decision tree
    2. machine learning
    3. zero knowledge proofs

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    CCS '20
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Upcoming Conference

    CCS '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)2,183
    • Downloads (Last 6 weeks)186
    Reflects downloads up to 24 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Decision Tree-Based Federated Learning: A SurveyBlockchains10.3390/blockchains20100032:1(40-60)Online publication date: 7-Mar-2024
    • (2024)Sparrow: Space-Efficient zkSNARK for Data-Parallel Circuits and Applications to Zero-Knowledge Decision TreesProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security10.1145/3658644.3690318(3110-3124)Online publication date: 2-Dec-2024
    • (2024)zkLLM: Zero Knowledge Proofs for Large Language ModelsProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security10.1145/3658644.3670334(4405-4419)Online publication date: 2-Dec-2024
    • (2024)SZKP: A Scalable Accelerator Architecture for Zero-Knowledge ProofsProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676898(271-283)Online publication date: 14-Oct-2024
    • (2024)MSMAC: Accelerating Multi-Scalar Multiplication for Zero-Knowledge ProofProceedings of the 61st ACM/IEEE Design Automation Conference10.1145/3649329.3655672(1-6)Online publication date: 23-Jun-2024
    • (2024)A Generative Framework for Low-Cost Result Validation of Machine Learning-as-a-Service InferenceProceedings of the 19th ACM Asia Conference on Computer and Communications Security10.1145/3634737.3657015(1246-1260)Online publication date: 1-Jul-2024
    • (2024)Cryptographic Primitives in Privacy-Preserving Machine Learning: A SurveyIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.332180336:5(1919-1934)Online publication date: May-2024
    • (2024)ValidCNN: A Large-Scale CNN Predictive Integrity Verification Scheme Based on zk-SNARKIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.337164321:6(5185-5195)Online publication date: Nov-2024
    • (2024)A Verifiable and Privacy-Preserving Federated Learning Training FrameworkIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2024.3369658(1-14)Online publication date: 2024
    • (2024)Data Protection: Privacy-Preserving Data Collection With ValidationIEEE Transactions on Dependable and Secure Computing10.1109/TDSC.2023.332629921:4(3422-3438)Online publication date: Jul-2024
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media