Abstract
In the big data era, data scientists apply machine learning methods to observed data for prediction and classification. To be effective, machine learning requires access to raw data, which is often privacy sensitive. In addition, whatever data and fitting procedures are employed, a crucial step is to select the most appropriate model for the given dataset. Model selection is a key ingredient of data analysis for reliable and reproducible statistical inference or prediction. To address these issues, we develop new techniques for running model selection over encrypted data. Our approach finds the best approximation of the relationship between the dependent and independent variables through cross validation. Performing 4-fold cross validation yields four different estimates of the model's error. We then use the bias and variance extracted from these errors to find the best model. We perform an experiment on a dataset taken from Kaggle and show that our approach can perform regression homomorphically on encrypted data without decrypting it.
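The selection procedure described above can be illustrated with a minimal plaintext sketch. It is not the encrypted-domain implementation from the paper: it runs on unencrypted numbers and only shows the shape of the computation. The two candidate models (`constant` and `linear`), the helper names, the toy data, and the scoring rule (mean plus variance of the four fold errors) are all assumptions made for illustration; the abstract does not specify them.

```python
from statistics import mean, pvariance

def fit_constant(xs, ys):
    """Hypothetical baseline: always predict the mean of the training targets."""
    c = mean(ys)
    return lambda x: c

def fit_linear(xs, ys):
    """Simple linear regression by ordinary least squares (closed form)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fold_errors(xs, ys, fit, k=4):
    """Return k mean-squared errors, one per held-out fold."""
    errors = []
    for i in range(k):
        # Every k-th sample, starting at offset i, forms the held-out fold.
        test_idx = set(range(i, len(xs), k))
        train_x = [x for j, x in enumerate(xs) if j not in test_idx]
        train_y = [y for j, y in enumerate(ys) if j not in test_idx]
        model = fit(train_x, train_y)
        errors.append(mean([(ys[j] - model(xs[j])) ** 2 for j in test_idx]))
    return errors

def select_model(xs, ys, candidates, k=4):
    """Score each candidate by mean + variance of its fold errors (assumed rule)."""
    scores = {}
    for name, fit in candidates.items():
        errs = fold_errors(xs, ys, fit, k)
        scores[name] = (mean(errs), pvariance(errs))
    best = min(scores, key=lambda n: scores[n][0] + scores[n][1])
    return best, scores

# Toy data (made up): roughly y = 2x + 1 with small deterministic noise.
xs = list(range(12))
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, 0.1, -0.05, 0.05, -0.1]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]

candidates = {"constant": fit_constant, "linear": fit_linear}
best, scores = select_model(xs, ys, candidates)
print(best, scores)
```

In an encrypted setting, each arithmetic step here (sums, products, the division in the slope, and the final comparison) would have to be replaced by a homomorphic counterpart, which is where most of the practical cost lies.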
This research is supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the IITP support program (2017-0-00545).
We thank Joonsoo Yoo and Jeonghwan Hwang for their assistance in this research.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hong, M.Y., Yoon, J.W. (2020). Model Selection for Data Analysis in Encrypted Domain: Application to Simple Linear Regression. In: You, I. (eds) Information Security Applications. WISA 2019. Lecture Notes in Computer Science(), vol 11897. Springer, Cham. https://doi.org/10.1007/978-3-030-39303-8_12
DOI: https://doi.org/10.1007/978-3-030-39303-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39302-1
Online ISBN: 978-3-030-39303-8
eBook Packages: Computer Science, Computer Science (R0)