Abstract
We introduce low complexity bounds on mutual information for efficient privacy-preserving feature selection with secure multi-party computation (MPC). Considering a discrete feature with N possible values and a discrete label with M possible values, our approach requires O(N) multiplications as opposed to O(NM) in a direct MPC implementation of mutual information. Our experimental results show that for regression tasks, we achieve a computation speed up of over 1,000\(\times \) compared to a straightforward MPC implementation of mutual information, while achieving similar accuracy for the downstream machine learning model.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Akavia, A., Galili, B., Shaul, H., Weiss, M., Yakhini, Z.: Privacy preserving feature selection for sparse linear regression. Cryptology ePrint Archive, Paper 2023/1354 (2023). https://eprint.iacr.org/2023/1354
Alishahi, M., Moghtadaiee, V.: Feature selection on anonymized datasets. In: IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC) (2023)
Alishahi, M., Moghtadaiee, V., Navidan, H.: Add noise to remove noise: local differential privacy for feature selection. Comput. Secur. (2022)
Aono, Y., Hayashi, T., Wang, L., Moriai, S., et al.: Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. (2017)
Araki, T., Furukawa, J., Lindell, Y., Nof, A., Ohara, K.: High-throughput semi-honest secure three-party computation with an honest majority. In: Proceedings of the 2016 ACM Conference on Computer and Communications Security (2016)
Banerjee, M., Chakravarty, S.: Privacy preserving feature selection for distributed data using virtual dimension. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management (2011)
Banerjee, S., Elmroth, E., Bhuyan, M.: Fed-FiS: a novel information-theoretic federated feature selection for learning stability. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds.) ICONIP 2021. CCIS, vol. 1516, pp. 480–487. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-92307-5_56
Battiti, R.: Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw. (1994)
Bogdanov, D., Laur, S., Willemson, J.: Sharemind: a framework for fast privacy-preserving computations. In: Jajodia, S., Lopez, J. (eds.) ESORICS 2008. LNCS, vol. 5283, pp. 192–206. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-88313-5_13
Cassara, P., Gotta, A., Valerio, L.: Federated feature selection for cyber-physical systems of systems. IEEE Trans. Veh. Technol. (2022)
Catrina, O., de Hoogh, S.: Improved primitives for secure multiparty integer computation. In: Garay, J.A., De Prisco, R. (eds.) SCN 2010. LNCS, vol. 6280, pp. 182–199. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15317-4_13
Catrina, O., Saxena, A.: Secure computation with fixed-point numbers. In: Sion, R. (ed.) FC 2010. LNCS, vol. 6052, pp. 35–50. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14577-3_6
Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. (2014)
Covert, I.C., Qiu, W., Lu, M., Kim, N.Y., White, N.J., Lee, S.-I.: Learning to maximize mutual information for dynamic feature selection. In: International Conference on Machine Learning. PMLR (2023)
Cramer, R., Damgård, I.B., et al.: Secure Multiparty Computation. Cambridge University Press (2015)
Dankar, F.K., Madathil, N., Dankar, S.K., Boughorbel, S.: Privacy-preserving analysis of distributed biomedical data: designing efficient and secure multiparty computations using distributed statistical learning theory. JMIR Med. Inf. (2019)
Evans, D., Kolesnikov, V., Rosulek, M., et al.: A pragmatic introduction to secure multi-party computation. Found. Trends® Priv. Secur. (2018)
Froelicher, D., et al.: Scalable and privacy-preserving federated principal component analysis. In: IEEE Symposium on Security and Privacy (SP) (2023)
Froelicher, D., et al.: Scalable privacy-preserving distributed learning. In: Proceedings on Privacy Enhancing Technologies (2021)
Fu, R., Wu, Y., Xu, Q., Zhang, M.: FEAST: a communication-efficient federated feature selection framework for relational data. Proc. ACM Manag. Data (2023)
Gascón, A.: Privacy-preserving distributed linear regression on high-dimensional data. Proc. Priv. Enhancing Technol. (2017)
Goldreich, O., Micali, S., Wigderson, A.: How to play any mental game, or a completeness theorem for protocols with honest majority, pp. 307–328. Association for Computing Machinery (2019)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. (2003)
Hu, Y., Zhang, Y., Gong, D., Sun, X.: Multi-participant federated feature selection algorithm with particle swarm optimizaiton for imbalanced data under privacy protection. IEEE Trans. Artif. Intell. (2022)
Jafer, Y., Matwin, S., Sokolova, M.: A framework for a privacy-aware feature selection evaluation measure. In: 13th Annual Conference on Privacy, Security and Trust (PST). IEEE (2015)
Kamm, L., Bogdanov, D., Laur, S., Vilo, J.: A new way to protect privacy in large-scale genome-wide association studies. Bioinformatics (2013)
Karhu, T.: Desaturation event scoring criteria affect the perceived severity of nocturnal hypoxic load. Sleep Med. 100, 479–486 (2022)
Karhu, T., Leppänen, T., Töyräs, J., Oksenberg, A., Myllymaa, S., Nikkonen, S.: ABOSA - freely available automatic blood oxygen saturation signal analysis software: structure and validation. Comput. Methods Programs Biomed. (2022)
Karhu, T., Myllymaa, S., Nikkonen, S., Mazzotti, D.R., Töyräs, J., Leppänen, T.: Longer and deeper desaturations are associated with the worsening of mild sleep apnea: the sleep heart health study. Front. Neurosci. 15, 657126 (2021)
Keller, M.: MP-SPDZ: a versatile framework for multi-party computation. In: Proceedings of the ACM Conference on Computer and Communications Security (2020)
Klosh, G., et al.: The siesta project polygraphic and clinical database. IEEE Eng. Med. Biol. Mag. (2001)
Knott, B., Venkataraman, S., Hannun, A., Sengupta, S., Ibrahim, M., van der Maaten, L.: CrypTen: secure multi-party computation meets machine learning. In: Advances in Neural Information Processing Systems (2021)
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. (1997)
Küsters, R., Liedtke, J., Müller, J., Rausch, D., Vogt, A.: Ordinos: a verifiable tally-hiding e-voting system. In: IEEE European Symposium on Security and Privacy (EuroS &P) (2020)
Langley, P., et al.: Selection of relevant features in machine learning. In: Proceedings of the AAAI Fall Symposium on Relevance (1994)
Law, M.H., Figueiredo, M.A., Jain, A.K.: Simultaneous feature selection and clustering using mixture models. IEEE Trans. Pattern Anal. Mach. Intell. (2004)
Li, A., et al.: Efficient and privacy-preserving feature importance-based vertical federated learning. IEEE Trans. Mob. Comput. (2023)
Li, J., et al.: Feature selection: a data perspective. ACM Comput. Surv. (CSUR) (2017)
Li, X., Dowsley, R., De Cock, M.: Privacy-preserving feature selection with secure multiparty computation. In: International Conference on Machine Learning. PMLR (2021)
Majithia, J., Levan, D.: A note on base-2 logarithm computations. Proc. IEEE 61(10), 1519–1520 (1973)
Mitra, P., Murthy, C., Pal, S.K.: Unsupervised feature selection using feature similarity. IEEE Trans. Pattern Anal. Mach. Intell. (2002)
Nikolaenko, V., Weinsberg, U., Ioannidis, S., Joye, M., Boneh, D., Taft, N.: Privacy-preserving ridge regression on hundreds of millions of records. In: IEEE Symposium on Security and Privacy (SP) (2013)
Ono, S., Takata, J., Kataoka, M., Shin, T.I.K., Sakamoto, H.: Privacy-preserving feature selection with fully homomorphic encryption. Algorithms (2022)
Polychroniadou, A., et al.: Prime match: a privacy-preserving inventory matching system. In: 32nd USENIX Security Symposium (2023)
Qin, Y., Kondo, M.: Federated learning-based network intrusion detection with a feature selection approach. In: IEEE International Conference on Electrical, Communication, and Computer Engineering (2021)
Raab, R.: Federated electronic health records for the European health data space. Lancet Digit. Health (2023)
Rao, V., Long, Y., Eldardiry, H., Rane, S., Rossi, R., Torres, F.: Secure two-party feature selection. arXiv preprint arXiv:1901.00832 (2019)
Romero, E., Sopena, J.M.: Performing feature selection with multilayer perceptrons. IEEE Trans. Neural Netw. (2008)
Russell Graves-Morris, P., Saff, E.B., Varga, R.S.: Rational Approximation and Interpolation. Lecture Notes in Mathematics. Springer, Heidelberg (1983). https://doi.org/10.1007/BFb0072395
Sangers, A., et al.: Secure multiparty pagerank algorithm for collaborative fraud detection. In: Goldberg, I., Moore, T. (eds.) FC 2019. LNCS, vol. 11598, pp. 605–623. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32101-7_35
Sav, S., Diaa, A., Pyrgelis, A., Bossuat, J.-P., Hubaux, J.-P.: Privacy-preserving federated recurrent neural networks. In: Proceedings on Privacy Enhancing Technologies (2023)
Sav, S., et al.: POSEIDON: privacy-preserving federated neural network learning. In: 28th Network And Distributed System Security Symposium (NDSS). Internet Society (2021)
Sheikhalishahi, M., Martinelli, F.: Privacy-utility feature selection as a privacy mechanism in collaborative data classification. In: IEEE International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (2017)
Stracuzzi, D.J., Utgoff, P.E.: Randomized variable elimination. J. Mach. Learn. Res. (2004)
Volder, J.: The CORDIC computing technique. In: International Workshop on Managing Requirements Knowledge. IEEE Computer Society (1959)
Wagh, S., Gupta, D., Chandran, N.: SecureNN: 3-party secure computation for neural network training. In: Proceedings on Privacy Enhancing Technologies (2019)
Yang, J., Honavar, V.: Feature subset selection using a genetic algorithm. IEEE Intell. Syst. Their Appl. (1998)
Yao, A.C.: Protocols for secure computations. In: 23rd Annual Symposium on Foundations of Computer Science. IEEE (1982)
Yao, X., Huang, T., Wu, C., Zhang, R., Sun, L.: Towards faster and better federated learning: a feature fusion approach. In: IEEE International Conference on Image Processing (ICIP) (2019)
Zhang, Q., Wang, C., Wu, H., Xin, C., Phuong, T.V.: GELU-Net: a globally encrypted, locally unencrypted deep neural network for privacy-preserved learning. In: International Joint Conferences on Artificial Intelligence (2018)
Zhang, R., Li, H., Hao, M., Chen, H., Zhang, Y.: Secure feature selection for vertical federated learning in eHealth systems. In: IEEE International Conference on Communications (2022)
Zhang, X., Mavromatics, A., Vafeas, A., Nejabati, R., Simeonidou, D.: Federated feature selection for horizontal federated learning in IoT networks. IEEE Internet Things (2023)
Acknowledgments
This research is funded by the EU Horizon Europe project HARPOCRATES (Grant ID. 101069535) and H2020 project CONCORDIA (Grant ID. 830927). We thank Tuomas Karhu for preparing the SpO2 data as well as help and advice in the process. We would also like to thank the anonymous reviewers for their comments and suggested improvements.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Appendices
A Appendix: Cutoff Probability
With x, \(x_1,...,x_k\) and \(\lambda _1,\dots ,\lambda _k\) as in Sect. 5.1, we shall see that we can modify the bound \(H(x) \le \sum _j \lambda _j C(x,x_j)\) by cutting off probabilities in \(C(x,x_j)\) at a given level \(0 < \epsilon \le \min _j(e^{-1/\lambda _j})\). More in detail, \(H(x) \le \sum _j \lambda _j \bar{C}(x,x_j)\) where \(\bar{C}(x,x_j)=\sum _i - \bar{p}_i(x) \log {\bar{p}_i(x_j)}\), \(\bar{p}_i(x_j)=\max (p_i(x_j),\epsilon )\) and \(\bar{p}_i(x)=\sum _j \lambda _j \bar{p}_i(x_j)\).
Note that \(f(z)=-z\log {z}\), for \(0 < z \le 1\), increases up to \(z=e^{-1}\) where it has its maximum \(f(e^{-1})=e^{-1}\) and decreases after that. Let \(g(z_1,\dots ,z_k)=(\sum _j -\lambda _j z_j)\sum _j \lambda _j \log {z_j}\) for \(0 < z_1,\dots ,z_k \le 1\). Then, \(H(x)=\sum _i f(p_i(x))\) and \(\sum _j \lambda _j C(x,x_j) = \sum _i g(p_i(x_1),\dots ,p_i(x_k))\). In the same way, \(\sum _j \lambda _j \bar{C}(x,x_j) = \sum _i g(\bar{p}_i(x_1),\dots ,\bar{p}_i(x_k))\) and using the concavity of the logarithm, we have a term wise inequality \(f(p_i(x)) \le g(p_i(x_1),\dots ,p_i(x_k))\).
Lemma 2
We have the alternative bound \(H(x) \le \sum _j \lambda _j \bar{C}(x,x_j)\).
Proof
We shall see that \(f(p_i(x)) \le g(\bar{p}_i(x_1),\dots ,\bar{p}_i(x_k))\) for all i. Fix \(1 \le i \le N\) and assume that at least one \(p_i(x_l) \le \epsilon \) so that \(\bar{p}_i(x_l)=\epsilon \). Suppose first that \(\bar{p}_i(x) < e^{-1}\). Since \(p_i(x) \le \bar{p}_i(x)\), we have that \(f(p_i(x)) \le f(\bar{p}_i(x))\) and \(f(\bar{p}_i(x)) \le g(\bar{p}_i(x_1),\dots ,\bar{p}_i(x_k))\) by the concavity of the logarithm. Now suppose that \(\bar{p}_i(x) \ge e^{-1}\). In this case, \(g(\bar{p}_i(x_1),\dots ,\bar{p}_i(x_k))=-\bar{p}_i(x)\sum _j \lambda _j \log {\bar{p}_i(x_j)} \ge \bar{p}_i(x)(-\lambda _l \log {\epsilon }) \ge \bar{p}_i(x)\) since \(-\lambda _l \log {\epsilon } \ge 1\). Finally, \(\bar{p}_i(x) \ge e^{-1} \ge f(p_i(x))\).
B Appendix: Computing the Logarithm in MPC
To compute entropy, cross-entropy, and mutual information under MPC, we need to evaluate the logarithm, or more precisely the function \(z \mapsto z \log {z}\). We use base 2 logarithms instead of natural logarithms to measure information (the unit of information is bits). Converting to natural logarithms would cost one extra multiplication by \(\ln {2}\) per feature.
To approximate \(\log _2{z}\) we convert z to base-2 floating point \(z=w \cdot 2^p\) where p is an integer and \(1/2 < w \le 1\). Note that \(\log _2{z} = p + \log _2{w}\) and employ a Padé rational function approximation to approximate \(\log _2{w}\) to the desired precision [49]. This approximation applied to a secret-shared value \([z]\) is denoted \(\log _2 [z]\). Note that there exist alternative ways to approximate the logarithm, for example Taylor polynomials. Compared to Taylor expansions however, Padé approximants are rational functions that allow using lower degree polynomials to approximate the function up to a given order [49]. This means that fewer multiplications are needed in the MPC protocol. For instance, in our experiments in Sect. 6.2, the logarithm is approximated as a quotient of two cubic polynomials while agreeing with the logarithm up to order 6. We have also considered alternatives such as CORDIC type approximations [55], other iterative schemes [40] or using logarithm tables, but have chosen to use Padé approximations based on test implementations and the number of multiplications needed to achieve sufficient precision.
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Eklund, D., Iacovazzi, A., Wang, H., Pyrgelis, A., Raza, S. (2024). BMI: Bounded Mutual Information for Efficient Privacy-Preserving Feature Selection. In: Garcia-Alfaro, J., Kozik, R., Choraś, M., Katsikas, S. (eds) Computer Security – ESORICS 2024. ESORICS 2024. Lecture Notes in Computer Science, vol 14983. Springer, Cham. https://doi.org/10.1007/978-3-031-70890-9_18
Download citation
DOI: https://doi.org/10.1007/978-3-031-70890-9_18
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70889-3
Online ISBN: 978-3-031-70890-9
eBook Packages: Computer ScienceComputer Science (R0)