Abstract
In this paper, we present a secure method of computing Pearson correlation coefficients that preserves both data privacy and data quality in a distributed computing environment. In typical data analytics and mining processes, individual data owners must provide their original data to third parties. In many cases, however, the original data contain sensitive information, and the data owners do not want to disclose the data in its original form, in order to preserve privacy. We therefore address the problem of secure multiparty computation of Pearson correlation coefficients. For the secure Pearson correlation computation, we first propose an advanced solution that exploits the secure scalar product. We then present an approximate solution that adopts a lower-dimensional transformation. Finally, we show empirically that the proposed solutions are practical in terms of execution time and data quality.
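To illustrate why a secure scalar product is a natural building block here, the following minimal sketch relies on a standard identity: the Pearson correlation coefficient of two series equals the scalar product of their mean-centered, unit-length versions. It is not the protocol proposed in the paper. The function names are hypothetical, and `scalar_product_placeholder` is a plain, non-secure dot product that only marks the step where a secure scalar product protocol would be used.

```python
import math


def normalize(values):
    """Mean-center a series and scale it to unit length (done locally by its owner)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return [c / norm for c in centered]


def scalar_product_placeholder(a, b):
    """Plain dot product. ASSUMPTION: stands in for a secure scalar product
    protocol; this placeholder provides no privacy at all."""
    return sum(x * y for x, y in zip(a, b))


def pearson_via_scalar_product(x, y):
    """Pearson correlation as the scalar product of locally normalized series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return scalar_product_placeholder(normalize(x), normalize(y))


def pearson_direct(x, y):
    """Textbook Pearson correlation, used only to cross-check the identity."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    x = [1.0, 3.0, 2.0, 5.0, 4.0]
    y = [2.0, 1.0, 4.0, 3.0, 6.0]
    print(pearson_via_scalar_product(x, y))
    print(pearson_direct(x, y))
```

In this sketch each owner normalizes its own series locally, so the scalar product is the only step that needs both parties' data.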
Acknowledgment
This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2016-0-00179, Development of an Intelligent Sampling and Filtering Techniques for Purifying Data Streams).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Hong, SK., Gil, MS., Moon, YS. (2018). Secure Computation of Pearson Correlation Coefficients for High-Quality Data Analytics. In: Liu, C., Zou, L., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10829. Springer, Cham. https://doi.org/10.1007/978-3-319-91455-8_8
DOI: https://doi.org/10.1007/978-3-319-91455-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91454-1
Online ISBN: 978-3-319-91455-8
eBook Packages: Computer Science, Computer Science (R0)