Research article, DOI: 10.1145/3338466.3358926

PrivFL: Practical Privacy-preserving Federated Regressions on High-dimensional Data over Mobile Networks

Published: 11 November 2019

Abstract

Federated Learning (FL) enables a large number of users to jointly learn a shared machine learning (ML) model, coordinated by a centralized server, where the data is distributed across multiple devices. This approach enables the server or users to train an ML model using gradient descent while keeping all the training data on users' devices. We consider training an ML model over a mobile network where user dropout is a common phenomenon. Although federated learning aims to reduce data privacy risks, the privacy of the ML model itself has not received much attention.
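As a rough illustration of the setting described above, the following Python sketch runs plain (unencrypted) federated gradient descent for linear regression while some users drop out in each round. It is not PrivFL's protocol: there is no encryption or secure aggregation, and the synthetic data, the dropout probability, the learning rate, and all function names are assumptions made for the example.

```python
import random


def local_gradient(w, X, y):
    """Sum of squared-error gradients over one user's local samples."""
    g = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, x_i)) - y_i
        for j, xj in enumerate(x_i):
            g[j] += err * xj
    return g


def federated_round(w, users, lr, dropout_prob, rng):
    """One round: the server averages gradients from users still online."""
    alive = [u for u in users if rng.random() >= dropout_prob]
    if not alive:
        return w  # every user dropped out; keep the current model
    total = [0.0] * len(w)
    n_samples = 0
    for X, y in alive:
        g = local_gradient(w, X, y)  # computed on the user's device
        total = [t + gj for t, gj in zip(total, g)]
        n_samples += len(y)
    return [wj - lr * tj / n_samples for wj, tj in zip(w, total)]


def make_users(num_users, samples_per_user, true_w, rng):
    """Hypothetical noiseless data: features are [1, u], u in [-1, 1]."""
    users = []
    for _ in range(num_users):
        X = [[1.0, rng.uniform(-1.0, 1.0)] for _ in range(samples_per_user)]
        y = [sum(a * b for a, b in zip(true_w, x)) for x in X]
        users.append((X, y))
    return users


rng = random.Random(0)
true_w = [0.5, 2.0]
users = make_users(num_users=10, samples_per_user=20, true_w=true_w, rng=rng)

w = [0.0, 0.0]
for _ in range(500):
    w = federated_round(w, users, lr=0.5, dropout_prob=0.3, rng=rng)

print("learned weights:", [round(v, 3) for v in w])
```

Because the server averages over whichever users reported in a round, a round with dropouts still takes a valid gradient step on the surviving users' data; only the raw data stays on the devices in this sketch, while the gradients themselves are exposed to the server.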
In this work, we present PrivFL, a privacy-preserving system for training (predictive) linear and logistic regression models and oblivious predictions in the federated setting, while guaranteeing data and model privacy as well as ensuring robustness to users dropping out in the network. We design two privacy-preserving protocols for training linear and logistic regression models based on an additive homomorphic encryption (HE) scheme and an aggregation protocol. Exploiting the training algorithm of federated learning, at the core of our training protocols is a secure multiparty global gradient computation on alive users' data. We analyze the security of our training protocols against semi-honest adversaries. As long as the aggregation protocol is secure under the aggregation privacy game and the additive HE scheme is semantically secure, PrivFL guarantees the users' data privacy against the server, and the server's regression model privacy against the users. We demonstrate the performance of PrivFL on real-world datasets and show its applicability in the federated learning system.
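To make the notion of additive homomorphic encryption concrete, here is a toy Python sketch of a Paillier-style scheme. The choice of Paillier, the tiny primes, and every function name are assumptions for illustration; the abstract does not specify which additive HE scheme PrivFL uses, and this sketch is neither secure nor a reproduction of PrivFL's training protocols. It shows only that ciphertexts can be added, and multiplied by plaintext scalars, without decryption, which is enough to evaluate an inner product between an encrypted vector and a plaintext one.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Enc(m) = (1 + m*n) * r^n mod n^2, with generator g = n + 1."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def scalar_mul(n, c, k):
    """Raising a ciphertext to k multiplies the plaintext by k."""
    return pow(c, k, n * n)


def encrypted_dot(n, enc_w, x):
    """Inner product of an encrypted vector with a plaintext vector."""
    acc = 1  # 1 is a (non-randomized) encryption of 0
    for c, xj in zip(enc_w, x):
        acc = add(n, acc, scalar_mul(n, c, xj))
    return acc


rng = random.Random(1)
n, sk = keygen(10007, 10009)  # toy primes, far too small for real use

# One party holds encrypted weights; another holds a plaintext feature vector.
w = [3, 5, 2]
x = [4, 1, 7]
enc_w = [encrypt(n, wj, rng) for wj in w]
dot = decrypt(n, sk, encrypted_dot(n, enc_w, x))

# Summing several encrypted values without decrypting any single one.
values = [10, 20, 12]
enc_sum = 1
for v in values:
    enc_sum = add(n, enc_sum, encrypt(n, v, rng))
total = decrypt(n, sk, enc_sum)

print("decrypted inner product:", dot)
print("decrypted sum:", total)
```

Plaintexts here are small non-negative integers reduced modulo n; real-valued model weights or gradients would first need a fixed-point encoding, which this sketch leaves out.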

Published In

CCSW'19: Proceedings of the 2019 ACM SIGSAC Conference on Cloud Computing Security Workshop
November 2019
209 pages
ISBN:9781450368261
DOI:10.1145/3338466
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. federated learning
  2. machine learning
  3. predictive analysis
  4. privacy-preserving computation

Qualifiers

  • Research-article

Funding Sources

  • NSERC Discovery Grants

Conference

CCS '19

Acceptance Rates

Overall Acceptance Rate 37 of 108 submissions, 34%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 68
  • Downloads (last 6 weeks): 17
Reflects downloads up to 01 Jan 2025

Citations

Cited By

  • (2024) RSAM: Byzantine-Robust and Secure Model Aggregation in Federated Learning for Internet of Vehicles Using Private Approximate Median. IEEE Transactions on Vehicular Technology 73:5 (6714-6726). DOI: 10.1109/TVT.2023.3341637. Online publication date: May 2024.
  • (2024) Practical and Robust Federated Learning With Highly Scalable Regression Training. IEEE Transactions on Neural Networks and Learning Systems 35:10 (13801-13815). DOI: 10.1109/TNNLS.2023.3271859. Online publication date: Oct 2024.
  • (2024) Robust and Privacy-Preserving Decentralized Deep Federated Learning Training: Focusing on Digital Healthcare Applications. IEEE/ACM Transactions on Computational Biology and Bioinformatics 21:4 (890-901). DOI: 10.1109/TCBB.2023.3243932. Online publication date: Jul 2024.
  • (2024) Client-Side Gradient Inversion Attack in Federated Learning Using Secure Aggregation. IEEE Internet of Things Journal 11:17 (28774-28786). DOI: 10.1109/JIOT.2024.3405939. Online publication date: 1 Sep 2024.
  • (2024) Efficiency Optimization Techniques in Privacy-Preserving Federated Learning With Homomorphic Encryption: A Brief Survey. IEEE Internet of Things Journal 11:14 (24569-24580). DOI: 10.1109/JIOT.2024.3382875. Online publication date: 15 Jul 2024.
  • (2024) A hybrid federated kernel regularized least squares algorithm. Knowledge-Based Systems 305 (112600). DOI: 10.1016/j.knosys.2024.112600. Online publication date: Dec 2024.
  • (2024) VPPLR: Privacy-preserving logistic regression on vertically partitioned data using vectorization sharing. Journal of Information Security and Applications 82 (103725). DOI: 10.1016/j.jisa.2024.103725. Online publication date: May 2024.
  • (2024) Model aggregation techniques in federated learning: A comprehensive survey. Future Generation Computer Systems 150 (272-293). DOI: 10.1016/j.future.2023.09.008. Online publication date: Jan 2024.
  • (2024) Privacy-preserving multi-party logistic regression in cloud computing. Computer Standards & Interfaces (103857). DOI: 10.1016/j.csi.2024.103857. Online publication date: Apr 2024.
  • (2024) EPFL-DAC: Enhancing Privacy in Federated Learning with Dynamic Aggregation and Clipping. Computers & Security 143 (103911). DOI: 10.1016/j.cose.2024.103911. Online publication date: Aug 2024.
