Article

Masking and Homomorphic Encryption-Combined Secure Aggregation for Privacy-Preserving Federated Learning

Department of Computer Science and Engineering, Konkuk University, Seoul 05029, Republic of Korea
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(1), 177; https://doi.org/10.3390/electronics14010177
Submission received: 13 November 2024 / Revised: 29 December 2024 / Accepted: 30 December 2024 / Published: 3 January 2025
(This article belongs to the Special Issue Security and Privacy in Emerging Technologies)

Abstract

Secure aggregation of local model parameters is crucial for achieving privacy-preserving federated learning. This paper presents a novel and practical aggregation method that combines the advantages of masking-based aggregation with those of homomorphic encryption-based techniques. Each node conceals its local parameters with an independently chosen random mask, eliminating the need for additional computations to generate or exchange mask values with other nodes. Instead, each node homomorphically encrypts its random mask using its own encryption key. During each federated learning round, nodes send their masked parameters and the homomorphically encrypted mask to the federated learning server. The server aggregates these updates in an encrypted state, directly computing the average of the actual local parameters across all nodes without needing to decrypt the aggregated result separately. To facilitate this, we introduce a new multi-key homomorphic encryption technique tailored for secure aggregation in federated learning environments. Each node uses a different encryption key to encrypt its mask value. Importantly, the ciphertext of each mask includes a partial decryption component from the node, so the sum of encrypted masks is automatically decrypted once all of them are aggregated. Consequently, the server computes the average of the actual local parameters by simply subtracting the decrypted sum of mask values from the cumulative sum of the masked local parameters. Our approach eliminates the need for interactions between nodes and the server for mask generation and sharing, while addressing the limitations of single-key homomorphic encryption. Moreover, the proposed aggregation process completes a global model update in just two interactions (in the absence of dropouts), significantly simplifying the aggregation procedure. Utilizing the CKKS (Cheon-Kim-Kim-Song) homomorphic encryption scheme, our method ensures efficient aggregation without compromising security or accuracy. We demonstrate the accuracy and efficiency of the proposed method through varied experiments on MNIST data.

1. Introduction

Federated learning [1] effectively protects the privacy of each user’s local data by only requiring the transmission of local model parameters, not the local data itself. However, local data can still be inferred from these parameters using inversion attacks [2,3,4], necessitating secure transmission of local parameters to the federated learning server. To conceal the local model parameters, several techniques have been proposed, including secure aggregation through secure multiparty computation [5,6], differential privacy [7,8], homomorphic encryption (HE) [9,10,11,12,13,14,15,16,17,18,19,20,21], and masking techniques [22,23,24,25,26,27,28,29]. Among these, mask-based aggregation and HE-based aggregation are the most widely adopted for ensuring privacy in federated learning.
The mask-based aggregation effectively conceals the actual local parameters by adding random mask values, which are removed post-aggregation, allowing only the computation of the sum of local updates. Although simple in computation, it requires an additional round for generating and sharing the mask values. The crucial aspect of this method is the efficient and secure generation of random masks that can be automatically removed after aggregation. To achieve this, each node must generate pairwise masks shared with all other nodes, necessitating an extra communication round for mask sharing. Moreover, to ensure robustness against dropouts, nodes must generate and share additional shares needed to reconstruct dropped masks.
HE-based aggregation allows nodes to encrypt their local parameters, with the federated learning server aggregating the updates in an encrypted form. Each node then retrieves the actual average of the parameters by decrypting the aggregated result. This method decreases the necessity for communication between nodes and the server for global model updates. However, HE computation is considerably more complex and computationally intensive than masking-based methods. Additionally, homomorphic operations are generally limited to ciphertexts encrypted using the same key. In federated learning, employing a single key for HE poses a security risk, as any malicious node or attacker who compromises a node could decrypt all ciphertexts produced by other nodes using the same key.
Multi-key homomorphic encryption (MKHE) [16] solves this problem by employing different keys for HE [15,16]. Homomorphic operations are possible on ciphertexts encrypted with different keys. However, decrypting these ciphertexts necessitates collaboration among all nodes. Each node contributes by generating a partial decryption using its own secret key, with the final decryption achieved by aggregating all these partial decryptions.
In this paper, we propose a novel and practical masking and HE-combined secure aggregation (MHESA) protocol that effectively integrates the advantages of both approaches and enables nodes to use different keys for HE. Each node’s local parameters are simply masked with a randomly chosen value by the node. Only the random mask is encrypted using the node’s own secret key, based on our proposed MKHE scheme. In each round of federated learning, each node transmits its masked parameters along with the homomorphically encrypted mask to the federated learning server. The server aggregates these updates by performing homomorphic additions on the ciphertexts, then obtains the sum of actual local parameters across all nodes without separately decrypting the aggregated result.
Compared to existing aggregation methods, our proposed method has several distinctions. First, our masking technique requires only one mask value per node, which is selected by the node itself, eliminating the need to create additional shared mask values. This approach removes any need for communication between nodes and the server to manage mask values.
Second, we introduce a new MKHE protocol based on CKKS (Cheon-Kim-Kim-Song) HE [21], specifically designed for secure aggregation in federated learning environments. In this protocol, each node creates its own private-public key pair while the server generates a group public key for all participating nodes. Since each mask is encrypted using an individual encryption key through our MKHE scheme, the ciphertext of each mask remains secure from all other participating nodes. Moreover, the individual keys of nodes are initially set in the setup phase of a new federated learning process and continue to be used until the termination of the entire process.
The essential feature of our model is that it does not require a separate decryption process to obtain the actual global update. The server calculates the total sum of the actual parameters through homomorphic addition of all ciphertexts provided by the nodes. In other words, decryption occurs automatically during homomorphic aggregation, overcoming the limitation of MKHE, which necessitates collaborative decryption from all nodes. To enable this, our proposed HE protocol produces a ciphertext that includes its partial decryption. Consequently, during the aggregation of the ciphertexts, the partial decryption components of individual ciphertexts are also aggregated simultaneously. As a result, the aggregated ciphertext is automatically decrypted after the aggregation. Importantly, not an individual ciphertext generated by each node but the total sum of ciphertexts is decrypted. Thus, the server can retrieve the actual sum of all original mask values by simply computing the sum of all ciphertexts. The server then computes the sum of the actual local parameters by subtracting the sum of mask values from the total sum of all masked parameters. In this process, the server determines only the sum of all mask values and cannot infer individual mask values from the ciphertexts. This information suffices for federated learning.
Thirdly, our protocol effectively addresses dropout scenarios without necessitating updates to individual keys. As previously mentioned, each ciphertext inherently includes its partial decryption. If a dropout occurs, the server cannot produce a correct aggregated result due to the absence of partial decryption components from the dropped nodes. Therefore, the remaining active nodes must regenerate their ciphertexts. In such instances, the server promptly generates a new group public key for the updated aggregation group and distributes it among the nodes in the new group. Subsequently, each node updates its local update with a new mask value and creates a new ciphertext of the mask value using its own key and the new group public key. Although nodes are required to regenerate and send their updated ciphertexts to the server, they do not need to alter or update individual encryption keys, even if the group of active nodes changes. This is a distinct advantage because, in our model, only the mask value is encrypted—not the local parameters, contrary to existing HE-based aggregation models that encrypt local parameters. In traditional models, active nodes must change their encryption keys to produce new ciphertexts for the same local parameters, necessitating additional computation and communication to update their keys. However, in our model, refreshing the mask value generates a new local update, eliminating the need for nodes to change their individual keys.
The main contributions of this paper can be summarized as follows:
  • Hybrid Secure Aggregation Strategy: We introduce a novel and practical secure aggregation strategy that integrates the advantages of both masking-based and HE-based aggregation methods. Compared to traditional masking-based techniques that often require complex collaborative mask management, our method eliminates the operational overhead associated with mask coordination and employs a new MKHE technique to encrypt masks. The encrypted masks are efficiently removed from the aggregated result using a homomorphic additive operation. This eliminates the need for collaborative operations while ensuring robust privacy for local model parameters.
  • Automatic Decryption and Efficient Global Model Update: The proposed MKHE technique allows the use of different keys and facilitates automatic decryption of the aggregated ciphertexts. Consequently, the server can directly retrieve the actual global model update without requiring an additional decryption process, a limitation seen in conventional MKHE models. Thus, our approach minimizes communication rounds, requiring only two interactions to complete a global model update under no-dropout conditions.
  • Key Management Independence: In our model, nodes can manage their keys independently of the aggregation group’s composition. In the existing MKHE-based aggregation techniques, where local parameters are directly encrypted, nodes need to update their keys whenever the aggregation group changes. This introduces considerable computational and operational overhead due to the need for the collaborative key updates among the newly formed group. In contrast, our method ensures that each node’s keys remain unchanged throughout the entire federated learning process. This is because only the mask value, not the local parameters, is encrypted, and individual mask values are independently refreshed by each node in every round of federated learning. Even in the presence of dropouts, nodes can perform the re-aggregation using their existing keys, eliminating the need for additional key update protocols.
We provide an overview of related works in Section 2, detail the protocols of our proposed model in Section 3, explore the correctness and security of our model in Section 4, analyze simulation performance regarding accuracy and efficiency, and conclude the paper in Section 5.

2. Related Works

We provide a brief overview of recent advancements in secure aggregation methods aimed at preserving privacy in federated learning, which are highly pertinent to our research.
The masking-based aggregation approach employs pairwise masks to conceal local parameters from the server. K. Bonawitz et al. [22,23] introduced an additive masking secure aggregation technique for federated learning, wherein users obscure their local updates with paired perturbations, removed during aggregation, allowing the server access only to the sum of all local updates. They further enhanced this method [23] to tolerate user dropouts by integrating Shamir’s secret sharing scheme. J. So et al. [28] developed turbo aggregation, performing circular aggregation across multiple groups and reducing communication overhead by limiting mask and data sharing to group members instead of all nodes. Additionally, J. Kim et al. [29] proposed a group-based aggregation strategy clustering nodes by similar response times based on their local processing time and locations, utilizing an additive masking technique to effectively address dropout situations without relying on Shamir’s secret sharing scheme and providing public verification of mask value integrity.
Y. Aono et al. introduced a deep learning algorithm [13] that uses HE to encrypt local model parameters, protecting the privacy of both local and global model parameters through homomorphic operations on ciphertexts, though it requires all participants to use the same private key for HE. H. Fang and Q. Qian proposed a HE-based federated learning strategy [14], also requiring a shared encryption key among all participants. In contrast, J. Park and H. Lim developed a federated learning model using HE [15], allowing participants to encrypt their local model parameters with individual private keys, enabling the server to update global model parameters using these variously encrypted local parameters within a distributed cryptosystem. However, this approach requires a third-party computation provider alongside the cloud server, necessitating collaboration between both parties to decrypt the encrypted local parameters.
W. Liu et al. introduced a round-efficient federated learning model [16] using multi-key fully homomorphic encryption (MKFHE), which enables computations on data encrypted across different parties and reduces the interactions required per federated learning round. Furthermore, nodes can dynamically join the homomorphic computation at any time by generating their own refreshing keys with the proposed multi-hop MKFHE. The refresh key converts old ciphertexts into new ciphertexts for newly formed node groups. In this process, each node must generate partial refresh keys for all others, and the server aggregates these into a single refresh key for each node. To minimize the size of encrypted local parameters, W. Jin et al. proposed a selective parameter encryption method [17] using HE in federated learning. This method selectively encrypts the most privacy-sensitive parameters using a HE key, although it still necessitates all nodes to use the same HE key.
Beyond these methods, several case studies on HE-based federated learning have been conducted. For instance, F. Wibawa et al. [18] developed a privacy-preserving method that integrates federated learning with HE to train convolutional neural network (CNN) models for COVID-19 detection. Local training takes place on lung X-ray images gathered from each hospital, and the model updates are transmitted and aggregated on the server in a homomorphically encrypted state. N. M. Hijazi et al. [19] proposed secure federated learning using fully HE in an Internet of Things (IoT) environment, while S. P. Sanon et al. [20] suggested a secure federated learning approach for training network traffic prediction models using encrypted network traffic. These implementations reveal that federated learning models incorporating HE can effectively and practically safeguard the privacy of local data in real-world applications.

3. MHESA: Masking and Homomorphic Encryption-Combined Secure Aggregation

The proposed model is built on CKKS’s RLWE-based HE scheme (denoted as CKKS-HE in the remainder of this paper) [21]. Before describing our MHESA-based federated learning model, we first briefly review the federated learning architecture used in this paper and the CKKS-HE scheme. We then present an overview of the proposed model and provide a detailed protocol.

3.1. Background and System Overview

3.1.1. Federated Learning Model

The federated learning system consists of a single central federated learning server (denoted as FS in the remainder of this paper) and N (mobile) users (or nodes). Throughout the rest of the paper, FS and U refer to the federated learning server and a set of nodes, respectively, where each node is denoted $u_i$.
FS trains a global model $w \in \mathbb{R}^d$ with dimension $d$ using data stored on mobile devices. The goal of this training process is to minimize a global objective function $F(w)$,
$$\arg\min_{w} F(w), \quad \text{where } F(w) = \sum_{i=1}^{N} \frac{x_i}{x} F_i(w). \quad (1)$$
Here, $F_i$ is the local objective function of $u_i$, $x_i$ represents the private data size of $u_i$, and $x = \sum_i x_i$. The local objective function $F_i(w)$ for the global model $w$ is defined as
$$F_i(w) = \frac{1}{x_i} \sum_{j=1}^{x_i} f_j(w), \quad \text{where } f_j(w) = \ell(X_j, Y_j; w). \quad (2)$$
$f_j(w)$ represents the loss of the prediction on example $(X_j, Y_j)$ made with model parameters $w$.
For a fixed learning rate $\eta$, FS trains the global model by iteratively performing distributed stochastic gradient descent (SGD) using the currently available mobile nodes. At iteration $t$, the server distributes the current global algorithm state (e.g., the current model parameters) $w^t$ to the mobile nodes. Each $u_i$ then computes $\nabla F_i(w^t)$, the average gradient on its local data at the current model $w^t$, and generates its local update $w_i^{t+1}$,
$$w_i^{t+1} := w^t - \eta \nabla F_i(w^t). \quad (3)$$
$u_i$ iterates the local update several times before transmitting the update to FS. Subsequently, FS combines these gradients and updates the global model for the subsequent iteration,
$$w^{t+1} := \sum_{i=1}^{N} \frac{x_i}{x} w_i^{t+1} = w^t - \eta \sum_{i=1}^{N} \frac{x_i}{x} \nabla F_i(w^t) = w^t - \eta \nabla F(w^t), \quad (4)$$
since the loss gradient $\nabla F(w^t)$ can be expressed as the weighted average across nodes, $\nabla F(w^t) = \sum_{i=1}^{N} \frac{x_i}{x} \nabla F_i(w^t)$.
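As a concrete illustration of the weighted averaging in Equation (4), the following minimal sketch computes the global update from local updates; the parameter vectors and dataset sizes are hypothetical.

```python
import numpy as np

# Minimal sketch of the weighted average in Equation (4); updates and dataset sizes are hypothetical.
def fedavg(local_updates, data_sizes):
    """w^{t+1} = sum_i (x_i / x) * w_i^{t+1}, with x = sum_i x_i."""
    x = sum(data_sizes)
    return sum((xi / x) * wi for wi, xi in zip(local_updates, data_sizes))

updates = [np.array([0.2, -0.1]), np.array([0.4, 0.0]), np.array([0.1, 0.3])]  # three nodes' parameters
sizes = [100, 300, 600]                                                        # their local dataset sizes
print(fedavg(updates, sizes))   # [0.2 0.17]: weighted average of the local parameter vectors
```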

3.1.2. RLWE-Based HE Scheme

CKKS-HE leverages modular arithmetic and noise management as its core techniques to enable secure computation on encrypted data. It represents data in ciphertexts defined over polynomial rings modulo a large integer. This ensures that encrypted computations remain within the bounds of the encryption system, allowing secure and efficient addition and multiplication of encrypted values. Every operation on encrypted data introduces a small amount of noise into the ciphertext. CKKS-HE controls the growth of this noise by reducing the ciphertext modulus with advanced noise management techniques such as rescaling and modulus switching. This keeps the accumulated noise manageable and preserves the integrity and decryptability of the encrypted data. The detailed encryption and decryption protocols are given as follows. In this paper, since re-encryption and homomorphic multiplication operations are not required, descriptions of the related protocols such as evaluation key generation, rescaling, and modulus switching are omitted. CKKS-HE is based on the ring learning with errors (RLWE) assumption. For a modulus $q$ and a base $p > 0$, let $q_l = p^l \cdot q$ for a level $0 < l \le L$. For a positive integer $M$, $\Phi_M(X)$ is the $M$-th cyclotomic polynomial of degree $n = \phi(M)$. $R = \mathbb{Z}[X]/(X^n + 1)$ is a power-of-two-degree cyclotomic ring, and $R_q = \mathbb{Z}_q[X]/(X^n + 1)$ is the residue ring of $R$ modulo $q$. A polynomial $A(x) \in R_{q_L}$ is defined as $A(x) = \sum_{0 \le i < n} a_i \cdot X^i$ with the vector of its coefficients $(a_0, \ldots, a_{n-1})$ in $\mathbb{Z}_{q_L}^n$; this coefficient vector is denoted as $A$. For a real $\sigma > 0$, $D(\sigma^2)$ denotes a distribution over $R$ that samples its $n$ coefficients independently from the discrete Gaussian distribution with variance $\sigma^2$. For a positive integer $h$, $HWT(h)$ is the set of signed binary vectors in $\{-1, 0, 1\}^n$ whose Hamming weight is exactly $h$. For a real $0 \le \rho \le 1$, the distribution $ZO(\rho)$ draws each entry of a vector in $\{-1, 0, 1\}^n$ with probability $\rho/2$ for each of $-1$ and $+1$, and probability $1 - \rho$ for $0$.
Given the parameters (n, h, q, p, σ), CKKS-HE employs five key algorithms: KeyGen, Ecd, Dcd, Enc and Dec.
  • KeyGen(n, h, q, p, σ) samples $s \leftarrow HWT(h)$, $A \leftarrow R_{q_L}$ and $e \leftarrow D(\sigma^2)$. It sets the secret key as $(1, s)$ and the public key $pk$ as $(B, A) \in R_{q_L}^2$, where $B = -A \cdot s + e \pmod{q_L}$.
  • Ecd(z; Δ) generates an integral plaintext polynomial $m(X)$ for a given $(n/2)$-dimensional vector $z = (z_j)_{j \in T} \in \mathbb{Z}[i]^{n/2}$. It calculates $m(X) = \mu^{-1}(\lfloor \Delta \cdot \pi^{-1}(z) \rceil_{\mu(R)}) \in R$, where $\Delta \ge 1$ is a scaling factor and $\pi$ is the natural projection defined by $(z_j)_{j \in \mathbb{Z}_M^*} \mapsto (z_j)_{j \in T}$ for a multiplicative subgroup $T$ of $\mathbb{Z}_M^*$ satisfying $\mathbb{Z}_M^*/T = \{\pm 1\}$. $\mu$ is the canonical embedding map from integral polynomials to elements of the complex field $\mathbb{C}^n$: it computes a polynomial whose evaluation values at the complex primitive roots of unity in the extension field $\mathbb{C}$ correspond to the given vector of complex numbers.
  • Dcd(m; Δ) returns the vector $z = \pi \circ \mu(\Delta^{-1} \cdot m)$ for an input polynomial $m \in R$, i.e., $z_j = \Delta^{-1} \cdot m(\zeta_M^j)$ for $j \in T$.
  • Enc(m, pk), for a public key $pk$ and a plaintext polynomial $m$, samples $v \leftarrow ZO(0.5)$ and $e_0, e_1 \leftarrow D(\sigma^2)$, and outputs a ciphertext $c = (c_0, c_1)$, where $c_0 = v \cdot A + e_0 \pmod{q_L}$ and $c_1 = v \cdot B + m + e_1 \pmod{q_L}$.
  • Dec(c, sk), for a ciphertext $c = (c_0, c_1)$ and a secret key $sk = (1, s)$, outputs $c_1 + c_0 \cdot s \pmod{q_l}$.
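Although not spelled out above, the correctness of Dec follows directly from these definitions. Substituting $B = -A \cdot s + e$, $c_0 = v \cdot A + e_0$ and $c_1 = v \cdot B + m + e_1$ gives
$$c_1 + c_0 \cdot s = v \cdot (-A \cdot s + e) + m + e_1 + (v \cdot A + e_0) \cdot s = m + (v \cdot e + e_1 + e_0 \cdot s) \approx m \pmod{q_L},$$
so decryption recovers the plaintext up to a small noise term, which the decoding step absorbs through the scaling factor $\Delta$.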

3.1.3. System Overview

We briefly outline our proposed federated learning model and detail the protocol in the subsequent section. The most notable features of the proposed model include (1) each node possesses its individual private and public key pair; (2) each node creates a masked local update, with only the mask value encrypted using a unique encryption key; and (3) the FS can directly derive a global model update by simply aggregating all local updates, without requiring further decryption.
To facilitate this, we propose a new MKHE scheme based on CKKS-HE, tailored to our federated learning model. We briefly describe the proposed MKHE scheme. In addition to the system parameters (n, h, q, p, σ) used in CKKS-HE, a public ring element $A \in R_Q$ and $Q = q^2$ are additionally set. Given the parameters (n, h, q, p, σ, A, Q), the proposed MKHE scheme consists of six algorithms: KeyGen, GroupKeyGen, Ecd, Dcd, Enc and Dec. Here, Ecd and Dcd are identical to the algorithms of CKKS-HE, so their descriptions are omitted.
  • KeyGen(n, h, q, p, σ, A, Q) generates the public-private key pair <$pk_i$, $sk_i$> and the commitment $c_i$ of each node $u_i$. It samples $s_i \leftarrow HWT(h)$, $v_i \leftarrow ZO(0.5)$ and $e_i, e_{0i} \leftarrow D(\sigma^2)$. It sets the secret key as $sk_i = (1, s_i)$ and the public key as $pk_i = -A \cdot sk_i + e_i \pmod{Q}$. Then, it sets a commitment $c_i$ for $v_i$ as $c_i = A \cdot v_i + e_{0i} \pmod{Q}$.
  • GroupKeyGen(PK, C, $U_T$) generates a group public key $PK_T$ and a group commitment $C_T$ for a given node set $U_T \subseteq U$, where $PK = \{pk_i\}$ and $C = \{c_i\}$ for all $u_i$ in $U$. It sets $PK_T = \sum_{u_j \in U_T} pk_j$ and $C_T = \sum_{u_j \in U_T} c_j$.
  • Enc($m_i$, $PK_T$, $C_T$, $sk_i$) outputs a ciphertext $E_i$ for a plaintext polynomial $m_i$. For the given group public key $PK_T$ and group commitment $C_T$, it samples $e_{1i} \leftarrow D(\sigma^2)$ and outputs the ciphertext $E_i = v_i \cdot PK_T + m_i + e_{1i} + C_T \cdot sk_i \pmod{Q}$.
  • Dec($E_T$, $U_T$) adds all ciphertexts $E_i$ in $E_T$, where $E_T = \{E_i\}$ for all $u_i$ in $U_T$. If the ciphertexts obtained from all nodes in $U_T$ are added, it outputs the sum of the plaintext polynomials generated by all nodes in $U_T$, i.e., $\sum (m_i + e)$.
The primary difference between the proposed protocol and the original CKKS-HE lies in its objective: the focus is not on decrypting individual ciphertexts but on decrypting the sum of ciphertexts. Therefore, instead of using individual public keys during encryption, a group public key and a group commitment, composed of the public keys and commitments of all nodes participating in the aggregation, are used. Additionally, to enable automatic decryption of the aggregated ciphertext sum without a separate decryption process, a partial decryption computed with each node's secret key is embedded into its ciphertext. Since the ciphertexts are encrypted with the group public key, an individual ciphertext cannot be decrypted unless the partial decryptions of all secret keys corresponding to the group public key are combined. This approach ensures the confidentiality of individual ciphertexts while allowing the decryption of the aggregated ciphertext sum, thereby guaranteeing secure aggregation in federated learning environments.
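The following toy sketch (not the paper's C++/CKKS implementation) illustrates this mechanism on plain integer polynomials: the parameters are tiny, the Gaussian/HWT/ZO samplers are replaced by uniform ternary polynomials, and the encoding keeps only a scaled constant coefficient, so it demonstrates only the algebra of the scheme, not its security.

```python
import random

n, q = 8, 2**20                  # toy parameters; the paper uses n = 2**16 and 800/1600-bit moduli
Q = q * q                        # encryption modulus Q = q^2
DELTA = 2**10                    # CKKS-style scaling factor used to encode real-valued masks

def polymul(a, b):
    """Negacyclic product in Z_Q[X]/(X^n + 1)."""
    res = [0] * (2 * n)
    for i in range(n):
        for j in range(n):
            res[i + j] += a[i] * b[j]
    return [(res[k] - res[k + n]) % Q for k in range(n)]

def padd(*polys):
    return [sum(cs) % Q for cs in zip(*polys)]

def small():                     # ternary polynomial standing in for HWT/ZO/Gaussian sampling
    return [random.choice([-1, 0, 1]) for _ in range(n)]

A = [random.randrange(Q) for _ in range(n)]          # public ring element shared by all nodes

class Node:
    """One node holding <sk_i, pk_i, c_i> and a secret real-valued mask M_i."""
    def __init__(self, mask):
        self.s, self.v = small(), small()
        self.m = [round(mask * DELTA)] + [0] * (n - 1)                     # encode mask in constant term
        self.pk = padd([(-x) % Q for x in polymul(A, self.s)], small())    # pk_i = -A*sk_i + e_i
        self.c = padd(polymul(A, self.v), small())                         # c_i  =  A*v_i  + e0_i
    def enc(self, PK, C):
        # E_i = v_i*PK_T + m_i + e1_i + C_T*sk_i : encryption plus embedded partial decryption
        return padd(polymul(self.v, PK), self.m, small(), polymul(C, self.s))

nodes = [Node(7.0), Node(11.0), Node(20.0)]
PK = padd(*[u.pk for u in nodes])                    # group public key  PK_T = sum of pk_i
C = padd(*[u.c for u in nodes])                      # group commitment  C_T  = sum of c_i
agg = padd(*[u.enc(PK, C) for u in nodes])           # server's homomorphic sum of the ciphertexts

coef = agg[0] if agg[0] < Q // 2 else agg[0] - Q     # centre the constant coefficient around zero
print(round(coef / DELTA))                           # 38 == 7 + 11 + 20
```

Running the sketch prints 38: once the three ciphertexts are summed, the embedded partial decryptions combine into a full decryption of the sum, while no individual mask is ever decrypted.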
Our federated learning model comprises two main phases: Setup and Learning.
Setup: This initial phase occurs when a new federated learning session starts. Its primary task is to generate system parameters and all necessary keys for our MKHE together with all nodes participating in the federated learning.
Learning: FS and nodes repeat this phase until the entire federated learning terminates. The Learning phase consists of two sub steps: initiation and aggregation. At the t-th (>0) round, in the initiation step, FS determines a set of nodes participating in the t-th round of Learning and generates a group public key for the nodes. Once the initiation is complete, the aggregation step begins. Each node ui updates its local model parameters wi and utilizes wi to generate a local update Di masked with a random secret Mi, and concurrently produces an encryption Ei of Mi. Node ui then transmits the tuple <Di, Ei> to FS. FS collects these updates and modifies the global model w by summing all local model parameters. In case any data is missing during aggregation, or a local update does not reach FS within the set aggregation period, the Learning phase is repeated using only the data from available nodes. Once the global model update is finalized, it is distributed to all nodes, and the Learning process repeats based on the updated global model.
We assume that nodes communicate solely with FS, and that both FS and the nodes operate under an ‘honest-but-curious’ model. Although they strictly follow the protocol, they remain continuously interested in extracting meaningful data from the interaction. In this threat model, the proposed MHESA satisfies specific security requirements:
(1) Privacy of local datasets and model parameters: All data stored on each node’s local device and the local model parameters transmitted over the network must remain confidential, shielded not only from other nodes but also from FS. FS only has access to the aggregated sum of all local updates provided by the nodes.
(2) Robustness to dropouts: In scenarios where data transmission is disrupted due to network issues or device malfunctions, FS must still be able to accurately compute a global model update.

3.2. MHESA Protocol

In this section, we elaborate on the specific details of the MHESA protocol. At the start of a new phase, FS initiates the Setup phase with all nodes in U.
Setup: FS collaboratively establishes system parameters and keys with each node ui as follows:
  • FS sets $n$, $h$, $q$ and $\sigma$ as described in Section 3.1.2 and generates $Q = q^2$ and a public ring element $A \in R_Q$ with the coefficient vector $(a_0, \ldots, a_{n-1}) \in \mathbb{Z}_Q^n$. FS publishes <n, h, q, σ, A, Q> to all nodes.
  • Each $u_i \in U$ generates its key pair <$pk_i$, $sk_i$> and a commitment $c_i$ by KeyGen(n, h, q, σ, A, Q), where $sk_i = (1, s_i)$, $pk_i = -A \cdot sk_i + e_i \pmod{Q}$ and $c_i = A \cdot v_i + e_{0i} \pmod{Q}$. Then, $u_i$ responds to FS with <$pk_i$, $c_i$, $x_i$>, where $x_i$ represents the size of $u_i$’s local dataset.
  • FS sets $PK = \{pk_i\}$, $C = \{c_i\}$ and $X = \{x_i\}$ for all $u_i$ in $U$.
Once Setup is completed, FS and nodes repeat the subsequent Learning phases until the federated learning process is concluded.
Learning: FS updates the global model by aggregating local model parameters from all available nodes, and nodes update their local models with the updated global model parameters. At the t-th iteration (t > 0),
[Initiation]
  • FS sends a start message to all nodes.
  • All available nodes respond to FS with their $x_i$, where $x_i$ represents the size of $u_i$’s local dataset.
  • FS sets $U_T$ as the t-th node group consisting of all nodes that replied and generates the t-th group parameters $PK_T$ and $C_T$ for all nodes in $U_T$ by GroupKeyGen(PK, C, $U_T$), where $PK_T = \sum_{u_i \in U_T} pk_i$ and $C_T = \sum_{u_i \in U_T} c_i$. It also sets the total dataset size $X_T = \sum_{u_i \in U_T} x_i$. Then, it broadcasts <$PK_T$, $C_T$, $X_T$> to all nodes in $U_T$.
[Aggregation]
4. For each $u_i$ in $U_T$, let $w_i^t$ represent the set of local model parameters of $u_i$ at the t-th iteration. $u_i$ selects a random real number $M_i^t \in \mathbb{R}$ and generates a masked local update $D_i^t$ according to Equation (5):
$$D_i^t = \frac{x_i}{X_T} \cdot w_i^t + M_i^t \quad (5)$$
Next, $u_i$ generates a plaintext polynomial $m_i^t(x) = Ecd(M_i^t; \Delta)$ and calculates the encryption of $m_i^t$, $E_i^t = Enc(m_i^t, PK_T, C_T, sk_i)$, as shown in Equation (6):
$$E_i^t = v_i \cdot PK_T + m_i^t + e_{1i} + C_T \cdot sk_i \pmod{Q} \quad (6)$$
Note that $E_i^t$ includes both the encryption of $m_i^t$ under the group public key $PK_T$ and the partial decryption by $u_i$. $u_i$ sends <$D_i^t$, $E_i^t$> to FS.
5. FS calculates $D_T^t = \sum_{u_i \in U_T} D_i^t$ and $E_T^t = \sum_{u_i \in U_T} E_i^t \pmod{q}$. Here, $E_T^t = Dec(\{E_i^t\}, U_T) = \sum (m_i^t + e)$, so FS obtains $Dcd(E_T^t; \Delta) = \sum M_i^t$. Finally, FS updates the t-th global update $w^t$ with the average of all local updates as described in Equation (7) (a numeric sketch of this computation is given at the end of this subsection):
$$w^t = D_T^t - Dcd(E_T^t; \Delta) = \sum_{u_i \in U_T} \frac{x_i}{X_T} w_i^t \quad (7)$$
FS distributes $w^t$ to all nodes in $U$.
6. Each $u_i$ updates its local model $w_i^{t+1}$ with $w^t$.
In our protocol, only homomorphic addition is required to aggregate all local updates, so the evaluation key used in CKKS-HE need not be generated. Furthermore, once the aggregation is complete, nodes refresh their masks and generate new ciphertexts for them. Therefore, none of the rescaling or leveled homomorphic-encryption machinery of CKKS-HE is needed. For simplicity, we used only two moduli, Q and q, for encryption and homomorphic addition.
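The sketch referenced above checks that Equations (5) and (7) recover the weighted average. The HE layer is abstracted away: the sum of masks is simply computed in the clear here, standing in for the automatically decrypted aggregate of Equation (6), and all values are hypothetical.

```python
import numpy as np

# Sketch of Equations (5) and (7): the server sees only masked updates D_i and, via the aggregated
# ciphertext, the sum of the masks; it never sees an individual w_i or M_i. Values are hypothetical.
rng = np.random.default_rng(0)
w = [np.array([1.0, 2.0]), np.array([3.0, 1.0]), np.array([0.5, 0.5])]   # local parameters w_i
x = [200, 300, 500]                                                      # local dataset sizes x_i
X = sum(x)

masks = [rng.uniform(-1e3, 1e3, size=2) for _ in w]          # each node picks its mask M_i alone
D = [xi / X * wi + Mi for wi, xi, Mi in zip(w, x, masks)]    # Eq. (5): masked local updates

sum_D = sum(D)             # server: sum of masked updates
sum_M = sum(masks)         # server: obtained only as a sum (via the aggregated ciphertext)
w_global = sum_D - sum_M   # Eq. (7): weighted average of the actual local parameters
print(w_global, sum(xi / X * wi for wi, xi in zip(w, x)))    # identical up to floating-point rounding
```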

3.3. Dropout Management

The aggregation protocol described above does not yet account for dropout situations. Due to network conditions, the local updates from some nodes might not reach FS promptly, yet the protocol must remain robust when nodes drop out. Since FS must repeatedly perform the Learning phase, it cannot wait indefinitely for all local updates to arrive in each round. To address this, a predefined waiting time is set for receiving local parameters. If a local update is not received within this timeframe, the corresponding node is considered a dropout node. This situation can arise in two scenarios: (1) the transmission of the local update is interrupted due to network issues, or (2) the update is delayed and arrives after the predetermined waiting time. In the second scenario, FS can still receive the local update from a node it previously classified as a dropout. Even if the local update from the dropout node eventually reaches FS, FS cannot determine the mask value or the actual local parameters from that update alone. However, a complication arises when there is only a single dropout node and the global model parameters are updated using the remaining nodes. Let U denote the set of all nodes that joined the t-th round of the Learning phase, let UD represent the set of nodes identified as dropouts, and let UA = U − UD be the set of available nodes. Suppose that UD = {ud} (i.e., a single dropout node) and FS updated the t-th global model parameter wA using only the nodes in UA. If Dd and Ed, generated by ud, are subsequently received by FS, then FS can compute an additional global update wU incorporating all nodes in U, including ud, by adding Dd and Ed to the parameters Di and Ei previously provided by the nodes in UA. Using this additional update, FS could then derive ud’s local parameter wd through the equation wd = wU − wA. This leak occurs only when there is a single dropout node. To prevent it, at least two sets of local parameters must remain hidden from FS whenever dropouts occur.
Therefore, to manage dropout nodes effectively, we consider two situations: (1) when two or more dropout nodes occur, and (2) when only one dropout node occurs. In the second case, an additional node must be excluded from the aggregation. Since the local update of the excluded node is not reflected in the global update, the accuracy of the global model may be slightly affected. To minimize this impact, FS randomly selects the node to exclude from a group of nodes with relatively small local datasets. As explained in Section 4.2, our experimental results show that the overall accuracy of the federated learning model is mainly determined by the accuracy of individual local nodes, and that the accuracy of each local node is highly sensitive to the size of its local dataset. Thus, FS forms UA by excluding one additional node uj with a relatively small xj (dataset size) from the original UA. FS then updates PKA, CA and XA for the new UA and shares these public parameters with the nodes in UA. Each uj in UA selects a new Mj′ and generates a new pair of Dj′ and Ej′ using Mj′, PKA, CA and XA. With these updates, FS computes the global update wA for UA as follows:
$$w_A = D_A - Dcd(E_A; \Delta) = \sum_{u_i \in U_A} \frac{x_i}{X_A} w_i, \quad \text{where } D_A = \sum_{u_i \in U_A} D_i' \text{ and } E_A = \sum_{u_i \in U_A} E_i'.$$
If some data from nodes in UD never reach FS, meaning they are completely lost, then the global update wU for all nodes in U cannot be computed at all, and the previously delivered local updates from all nodes remain secure. On rare occasions, however, all updates deemed dropouts may later be delivered to FS, in which case FS can compute both wA for UA and wU for U. Even then, with two or more dropout nodes, FS remains unable to discern the individual local update of any node ud in UD; it can only calculate wD = wU − wA, which is the cumulative sum of wd over all ud in UD. In a single-dropout situation, UA is modified so that effectively at least two nodes are excluded, and the wd of each dropout node therefore remains secure.
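A minimal sketch of this exclusion rule follows; node names and dataset sizes are illustrative, and whereas the paper selects the extra node randomly among those with relatively small datasets, this sketch simply picks the smallest.

```python
# Sketch of the dropout rule above: if exactly one node dropped, exclude one more small-dataset node
# so that at least two local updates stay hidden from FS. Names and sizes are illustrative.
def choose_active_set(all_nodes, dropped, data_sizes):
    active = [u for u in all_nodes if u not in dropped]
    if len(dropped) == 1:
        extra = min(active, key=lambda u: data_sizes[u])   # paper: random pick among small-x_j nodes
        active.remove(extra)
    return active

sizes = {"u1": 900, "u2": 400, "u3": 650, "u4": 700, "u5": 1200}
print(choose_active_set(["u1", "u2", "u3", "u4", "u5"], {"u3"}, sizes))
# ['u1', 'u4', 'u5']: u3 dropped, and u2 is additionally excluded before re-aggregation
```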

4. Analysis

4.1. Correctness and Security

In this section, we first demonstrate the correctness of the proposed scheme and subsequently analyze the privacy of the local model parameters. For the protocol to function properly, the federated learning server must be able to compute the sum of the original mask values from the encrypted mask values. Consequently, we establish that the homomorphic sum $E_U^t$ of all encrypted masks $E_i^t$ accurately reveals the sum of all original $m_i^t$.
For $PK_U = \sum_{u_i \in U} pk_i$, let $V_U = \sum_{u_i \in U} v_i$ and $SK_U = \sum_{u_i \in U} sk_i$. Intuitively, $PK_U$ is the group public key generated from the public keys of all nodes, $SK_U$ is the corresponding group secret key for $PK_U$, and $V_U$ is the aggregated group random vector used for encryption. With these parameters, we derive the sum of all $m_i^t$ from $E_U^t$ as follows:
1: $E_U^t = \sum_{u_i \in U} E_i^t \pmod{q}$
2: $= \sum_{u_i \in U} \left( v_i \cdot PK_U + m_i^t + e_{1i} + C_U \cdot sk_i \right) \pmod{q}$
3: $= PK_U \cdot \sum v_i + \sum m_i^t + \sum e_{1i} + C_U \cdot \sum sk_i \pmod{q}$
4: $= PK_U \cdot V_U + \sum m_i^t + \sum e_{1i} + C_U \cdot SK_U \pmod{q}$
5: $= \left( \sum pk_i \right) \cdot V_U + \sum m_i^t + \sum e_{1i} + \left( \sum c_i \right) \cdot SK_U \pmod{q}$
6: $= \sum \left( -A \cdot sk_i + e_i \right) \cdot V_U + \sum m_i^t + \sum e_{1i} + \sum \left( A \cdot v_i + e_{0i} \right) \cdot SK_U \pmod{q}$
7: $= -A \cdot \left( \sum sk_i \right) \cdot V_U + \left( \sum e_i \right) \cdot V_U + \sum m_i^t + \sum e_{1i} + A \cdot \left( \sum v_i \right) \cdot SK_U + \left( \sum e_{0i} \right) \cdot SK_U \pmod{q}$
8: $= -A \cdot SK_U \cdot V_U + \left( \sum e_i \right) \cdot V_U + \sum m_i^t + \sum e_{1i} + A \cdot V_U \cdot SK_U + \left( \sum e_{0i} \right) \cdot SK_U \pmod{q}$
9: $= \sum m_i^t + \left( \sum e_i \right) \cdot V_U + \sum e_{1i} + \left( \sum e_{0i} \right) \cdot SK_U \pmod{q}$
10: $= \sum \left( m_i^t + e_i \cdot V_U + e_{1i} + e_{0i} \cdot SK_U \right) \pmod{q}$
11: $= \sum Dec(C_i, SK_U) \pmod{q}$, where $C_i = Enc(m_i^t, PK_U)$ in CKKS-HE.
Decryption in CKKS-HE amounts to removing the public ring element $A$ from the ciphertext, resulting in a plaintext of the form $m + e$. By summing all ciphertexts, as shown at line 4, the terms $PK_U \cdot V_U$ and $C_U \cdot SK_U$ are formed. These equal $\sum (-A \cdot sk_i + e_i) \cdot V_U$ and $\sum (A \cdot v_i + e_{0i}) \cdot SK_U$, as shown at line 6, and reduce to $-A \cdot SK_U \cdot V_U$ and $A \cdot V_U \cdot SK_U$ plus error terms, as shown at line 8. Consequently, the terms containing the public ring element cancel, leaving the sum of all plaintexts together with small error terms. As shown at line 10, the sum of the $E_i^t$ thus equals the sum of the values obtained by applying CKKS’s decryption algorithm $Dec(C_i, SK_U)$ to the ciphertexts $C_i = Enc(m_i^t, PK_U)$, i.e., to each node’s plaintext $m_i^t$ encrypted with CKKS’s encryption algorithm under the group key. Finally, the server accurately recovers the sum of all original $M_i^t$ through CKKS’s decoding operation, $Dcd(E_U^t; \Delta) = \sum M_i^t$.
We demonstrate that our protocol maintains the privacy of local model parameters.
Theorem 1.
The proposed MHESA-based federated learning model preserves the privacy of the local model parameters of nodes if and only if, in each iteration of the Learning phase, the total number of participating nodes is 4 or more and, when dropouts occur, the number of currently available (active) nodes is 3 or more.
Proof. 
In our model, the number of active nodes participating in the Learning phase must be at least 2. For a given node set U, the group public key PKU for U is generated using the public keys of all nodes in U. When ciphertexts from all nodes in U are summed, the partial decryption parts of the ciphertexts are also summed and the decryption with the group private key SKU associated with PKU is completed. Thus, the summed ciphertext produces the cumulative sum of actual mask values contributed by all nodes in U.
If |U| = 1, the group public-private key pair <PKU, SKU> is identical to the public-private key pair <pk, sk> of the sole node in U. In this case, a ciphertext containing its decryption with sk directly reveals the plaintext itself. Thus, the proposed encryption scheme is unsuitable for a single node scenario. Conversely, when |U| ≥ 2, the decrypted value corresponds to the sum of plaintexts from all participating nodes in U. Because only the sum, and not individual plaintexts, is computed, the privacy of each node’s local update is preserved. Therefore, the model inherently requires a minimum of 2 active nodes to operate securely.
In dropout scenarios, as discussed in Section 3.3, the model assumes the presence of at least two dropout nodes when a dropout occurs. This may include one active node treated as a dropout. Consequently, at least 3 active nodes are required when a single dropout happens. For scenarios involving 2 or more dropout nodes, at least 2 active nodes are necessary to maintain the protocol. Therefore, the proposed model requires at least 4 nodes, including at least 3 active nodes, to ensure privacy-preserving federated learning.
Next, we prove that neither the federated learning server nor any other node can infer the original mask value from the local updates. As our proposed MKHE is based on CKKS-HE, its security depends on the robustness of CKKS-HE. The distinction lies in the ciphertext of our MKHE, which includes a partial decryption. Hence, we assess whether this alteration could reveal the original mask value. First, we prove that it is impossible to ascertain $m_i^t$ from its corresponding ciphertext $E_i^t$ without the secret values $v_j$ and $sk_j$ of the other nodes $u_j$.
The equation $E_i^t = v_i \cdot PK_U + m_i^t + e_{1i} + C_U \cdot sk_i$ consists of two components: a partial encryption of $m_i^t$ and a partial decryption for $m_i^t$. Specifically, $v_i \cdot PK_U + m_i^t + e_{1i}$ is $u_i$’s partial encryption of $m_i^t$ using the group public key $PK_U$ and $u_i$’s random secret $v_i$, while $C_U \cdot sk_i$ is $u_i$’s partial decryption for $m_i^t$ with its secret key $sk_i$. Because $m_i^t$ is encrypted with $PK_U$, it can only be decrypted with the corresponding group secret key $SK_U$. However, $E_i^t$ includes only $u_i$’s partial decryption $C_U \cdot sk_i$. To derive $m_i^t$ from $E_i^t$, the server must compute $C_U \cdot SK_U$ and $v_i \cdot PK_U$, which requires knowledge of $u_i$’s $v_i$ and of $C_U \cdot sk_j$ for all other nodes $u_j$. Without knowing $sk_j$ and $v_j$ of each $u_j$, it is infeasible to compute $C_U \cdot sk_j$ and $v_j \cdot PK_U$, making it impossible to determine $m_i^t$ from $E_i^t$.
Second, we demonstrate that it is also impractical to identify an individual $m_i^t$ from the set of all ciphertexts $E_i^t$. In the preceding proof, we established that determining $m_i^t$, $v_i \cdot PK_U$, and $C_U \cdot sk_i$ from a single $E_i^t$ is impossible without knowledge of $u_i$’s $sk_i$ and $v_i$. Once the $E_j^t$ of all $u_j$ are delivered to the server, it can perform homomorphic operations on these ciphertexts. As demonstrated in the correctness proof, aggregating all $E_j^t$ forms $V_U \cdot PK_U$ and $C_U \cdot SK_U$ inside the aggregated ciphertext, which enables decryption and reveals the total sum of the plaintexts $m_j^t$ from all nodes $u_j$. Here, the decrypted value is not an individual $m_j^t$ but the sum of all $m_j^t$. The server cannot identify either $V_U \cdot PK_U$ or $C_U \cdot SK_U$ from the aggregated ciphertext because their ring-element components cancel each other during aggregation. In our protocol, each plaintext $m_i$ is encrypted and decrypted using $u_i$’s secret values $v_i$ and $sk_i$, and the group public key is the sum of the corresponding values from all nodes. Consequently, the server cannot decrypt a message $m_i$ of $u_i$ unless all other nodes $u_j$ generate $E_j$ for $m_i$ using their $v_j$ and $sk_j$ and provide these to the server. Thus, the proposed protocol safeguards the privacy of the local model parameters, as it is impractical to extract an individual $m_i^t$ from the provided updates under the honest-but-curious threat model. □

4.2. Simulated Performance

In this section, we analyze the accuracy and efficiency of the proposed aggregation model using the MNIST [30] database. We compare the accuracy of the proposed aggregation model to that of a single centralized learning model. Additionally, we evaluate the accuracy of the proposed federated learning model while varying the number of nodes under two scenarios: IID and non-IID data distributions. To demonstrate the efficiency of the proposed model, we examine the computation time each node takes to generate its local update, the computation time the server requires to aggregate all local updates, and the size of the data communicated between the server and the clients. Specifically, we aim to quantify the increase in computation and communication overhead at each local node, and the corresponding communication delay, caused by the use of homomorphic encryption.

4.2.1. Experimental Environment

We first describe our experimental environment in detail. Two systems were used to implement the server and clients. The server system is equipped with an Intel(R) Core(TM) i7-12700K CPU, Intel(R) UHD Graphics 770 GPU, NVIDIA GeForce RTX 3080 GPU, and 32 GB of RAM. The client system is configured with an Intel(R) Core(TM) i9-7920X CPU, two NVIDIA GeForce GTX 1080Ti GPUs, and 32 GB of RAM. Multiple nodes were implemented by creating multiple threads on the client system, ensuring that all nodes have the same computing power and communication environment. Moreover, both systems are on the same network with a bandwidth of 1 Gbps, meaning that actual communication time, including latency, was not considered in the experiments.
For the individual training of each node, a two-layer CNN model with 5 × 5 convolution layers (the first with sixteen channels, the second with thirty-two channels, each followed by 2 × 2 max pooling), ReLU activation, and a final softmax output layer was used. Local learning on the client side was implemented using the PyTorch framework, while the federated learning logic on both sides was implemented in C++ with open-source CKKS code. The MNIST database consists of a training set of 60,000 images and a test set of 10,000 images. The data is distributed in two ways, IID and non-IID, depending on the number of nodes participating in the federated learning. In the IID distribution, data is distributed evenly among the nodes, whereas in the non-IID distribution, the size of the data allocated to each node varies. This leads to differences in execution times and, ultimately, varying response times to FS. This setup effectively simulates heterogeneous nodes with diverse computing environments. Table 1 shows the size of data allocated to nodes according to the number of nodes in both distributions.
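A minimal PyTorch sketch of the two-layer CNN described above follows; the padding and the width of the final fully connected layer are our assumptions, as they are not specified in the text.

```python
import torch.nn as nn

class MnistCNN(nn.Module):
    """Two 5x5 conv layers (16 and 32 channels), each followed by 2x2 max pooling, then a 10-way output."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),   # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),  # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)   # softmax is applied by the cross-entropy loss

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```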
To ensure 128 bits of security for CKKS-HE, we used a 16-bit ring polynomial degree (n = 2^16), an 800-bit modulus q for decryption, and a 1600-bit modulus Q for encryption. The server and clients completed 100 rounds of the Learning phase (the global model updating phase) to finalize a federated learning process. Additionally, we repeated each experiment 10 times and report the average result. Table 2 summarizes the experimental parameters and values used in our experiments.

4.2.2. Experimental Results

We first demonstrate the accuracy of our proposed model by comparing the aggregated results obtained with MHESA to those of raw data aggregation. We then assess how differences in accuracy relate to the number of nodes and the data distribution method. Table 3 compares the accuracy of our model with that of raw data aggregation by round. In this experiment, there are 10 nodes, and data is evenly distributed, with each node using 6000 images for local learning. We obtained the average accuracy of the two models through 100 repeated experiments.
As Table 3 shows, the accuracy difference is very slight, approximately 0.02%, with multiple rounds where our model demonstrates higher accuracy. This variance is due to the random assignment of data to each node in every experiment, rather than the aggregation method itself. Over 100 rounds, the accuracy for both methods averages around 97.7%, indicating negligible accuracy loss due to our aggregation approach using masking and HE. Table 4 shows the average accuracy of our proposed model based on the number of nodes under both IID and non-IID distributions. N indicates the number of nodes.
As shown in Table 4, the accuracy depends markedly on the number of nodes. With N = 10, the accuracy of federated learning is 97.74%, slightly lower than the 99.01% achieved when N = 1, which corresponds to a centralized single model. As the number of participating nodes increases, each node receives less data, leading to decreased local learning accuracy and, in turn, a reduction in the overall accuracy of federated learning. Notably, as the node count grows, the accuracy for non-IID distributions is slightly higher than for IID. In non-IID scenarios, as Table 1 shows, some nodes receive significantly more data than in the IID setup, enabling those nodes to achieve greater local learning accuracy and thereby enhancing the overall accuracy of federated learning. The local accuracy of each node influences the overall performance, suggesting that if individual nodes develop effective local models with their independent data, federated learning with these nodes yields more accurate results. Figure 1 illustrates the accuracy of the proposed model according to the number of nodes per round in the IID and non-IID distributions.
We next evaluate the effectiveness of the proposed model by examining the size of communication data and computation time. The model encompasses Setup and Learning phases; the Setup phase occurs only once at the onset of federated learning. A critical task during the Setup phase involves each node generating its own public key <pki, ci> after the server has created the public ring A. Subsequently, the server crafts a group public key <PKU, CU> using these keys.
The Learning phase is carried out repeatedly in each round of federated learning. During this phase, each node produces Di, its local parameters wi masked with a randomly chosen mask Mi, together with the ciphertext Ei of Mi, and forwards both to the server. The server aggregates them, computes DU (the sum of the Di) and EU (the sum of the Ei), then calculates the global update w from these values and distributes it to the nodes. The exact sizes of these parameters at the key stages are itemized in Table 5.
In the Setup phase, the public ring parameter A transmitted by the server to each node is approximately 12.5 MB, and the combined size of PKU and CU is about 19.5 MB, independent of the node count. Similarly, the pair of pki and ci transmitted by each node to the server is also roughly 19.5 MB. During the Learning phase, each node sends approximately 12.5 MB for the ciphertext Ei along with about 0.17 MB for the masked local parameters Di. The global update w, which the server sends back to the nodes, is approximately 0.17 MB. Compared to a masking-based aggregation model, the amount of data transmitted during the aggregation process in the proposed model is significantly larger. In a masking-based aggregation model, only the masked local parameters are transmitted, making the size of the transmitted data proportional to the size of the local parameters, similar to the size of Di (0.17 MB) in our experiments. In the proposed model, both the masked parameters Di and the ciphertext of the mask Ei are transmitted. While Di is proportional to the size of the local parameters, Ei is independent of the number of local parameters and is instead determined by the ring polynomial used in HE. Therefore, even if the number of local parameters increases, the size of Ei remains constant. However, to ensure the security of the proposed HE, a ring polynomial of degree 2^16, along with 800-bit and 1600-bit moduli, is required, which inevitably increases the size of the transmitted data.
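As a rough sanity check of the ~12.5 MB figure, assuming each ring element is stored as n = 2^16 coefficients of 1600 bits (the encryption modulus Q) without compression:

```python
# Back-of-envelope size of one ring element / ciphertext under the stated parameters (an assumption,
# not a measurement): n = 2**16 coefficients, each 1600 bits, uncompressed.
n_coeffs, coeff_bits = 2**16, 1600
print(n_coeffs * coeff_bits / 8 / 2**20)   # 12.5 MiB, matching the reported size of A and of each E_i
```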
We further analyze the actual time required to perform the main operations associated with HE, which represent the most time-intensive computations in our aggregation model. During the Setup phase, we determined the average time for the server to generate the public ring polynomial A, the average time for each node to compute and send its own pki and ci pair, and the average time for the server to compute and dispatch the group public key pair, PKU and CU. For the Learning phase, we assessed the average time for each node to calculate and transmit its local update pair, Di and Ei, and the average time for the server to compute the global update by aggregating all Dis and Eis. Table 6 presents a summary of the actual average times required to execute these principal operations. These durations are the averages of all times recorded in individual experiments conducted with different numbers of nodes (N = 10, 20, 50, and 100) under both IID and non-IID conditions.
As indicated in Table 6, the computation time for generating ring parameters is approximately 35 ms, while generating individual public key pairs takes about 115 ms. Masking the local parameters and encrypting the mask value requires around 60 ms. All these computations are performed independently at each node, and the computation time is determined by the degree of the ring polynomial used in HE, regardless of the number of nodes. Therefore, even as the scale of federated learning grows, the execution time for each node remains the same. On the other hand, the time required for FS to aggregate all local updates scales proportionally with the number of nodes, with each addition operation lasting about 5 to 7 ms. While the total aggregation time increases with the number of nodes, the time required for a single addition operation is relatively short. As a result, even if the scale of federated learning grows to thousands of nodes, aggregation can still be completed within a few seconds.
Nonetheless, the data transmission time between the server and nodes is significantly longer, ranging from 590 ms to 1569 ms, attributable to the large size of the transmitted data. Because communication time is highly dependent on the specific networking environment, these results, obtained under a single local network configuration, may not be broadly applicable. Nevertheless, it is clear that HE increases the communication load, leading to longer communication times. Excluding communication time, the time required for each node to generate its local update and for the server to aggregate all local updates in each Learning phase is relatively short, demonstrating that the proposed model is feasible for real-world applications.
Figure 2 presents the total aggregation time based on the number of nodes under the non-IID distribution. The total aggregation time is the duration required to complete 100 rounds of the Learning phase. This includes the time taken by each node in each round to update its local model, generate its local update (including mask encryption), and transmit the update. In the non-IID distribution, since the size of each node’s local dataset differs, the execution time of each node differs, resulting in different response times to FS. We evaluated two scenarios: one with dropout occurrences and one without any dropouts. To handle dropouts, a waiting-time threshold for each round must be defined. If a node failed to deliver its update to FS within the predefined waiting time, it was classified as a dropout. The waiting time per round was determined from the maximum round time observed in our experimental measurements. In the absence of dropouts, the aggregation proceeded to the next step immediately after receiving updates from all nodes, even if the designated waiting time had not elapsed. Conversely, in the presence of dropouts, FS waited for the full predefined waiting time before initiating re-aggregation with the remaining active nodes. As the number of nodes increased, the time required for FS to receive and aggregate updates also grew, leading to a proportional increase in the total aggregation time. When no dropout occurred, the average total aggregation time was approximately 933 s for N = 10, 2018 s for N = 20, 7092 s for N = 50, and 13,499 s for N = 100.
For a dropout scenario, assuming a 10% dropout rate every 5 rounds, the average total aggregation time was 1129 s for N = 10, 2434 s for N = 20, 8528 s for N = 50, and 16,436 s for N = 100. These results demonstrate that both the number of nodes and the occurrence of dropouts significantly impact the total aggregation time. Dropouts, in particular, introduce additional waiting times and computational overhead, exacerbating aggregation delays as the scale of the system grows.
Finally, we compare the key features of our proposed model with existing aggregation models. We employ the model by K. Bonawitz et al. [22], which represents a masking-based aggregation strategy, alongside two recently proposed HE-based aggregation strategies by J. Park and H. Lim [15], and W. Liu et al. [16]. These HE-based models enable the use of different encryption keys for nodes. Table 7 below summarizes this comparison.
In the masking-based aggregation method, each node must generate masks for all participating nodes, leading to computational and communication overhead that increases proportionally with the total number of nodes. Additionally, mask sharing among all nodes must be completed before aggregation, making this approach unsuitable for self-adaptive models where the composition of participating nodes can dynamically change.
In the MKHE-based aggregation method, individual encryption keys are used by each node to enhance the security of local updates. However, the decryption process requires all nodes to provide partial decryptions for the aggregated ciphertext, adding complexity. Furthermore, when the aggregation group changes, the nodes in the new group must perform key refreshing processes, which can be inefficient in environments where node composition frequently changes.
Compared to existing aggregation models, the main advantage of our model is its ability to simplify and streamline the aggregation process. The proposed model significantly reduces the number of interactions between the server and nodes: only two interactions are required to complete the global model update, provided there are no dropouts. While the proposed model simplifies the aggregation process, the use of homomorphic encryption increases the data size, which may affect communication efficiency. However, the generation of local updates is performed independently on each node, and the data size depends solely on the security strength of the homomorphic encryption and on the local model parameters; the computational and communication overhead of each node is therefore independent of the total number of participating nodes. Moreover, even when the aggregation group changes, the server only needs to update and distribute the group key for the new node group. This makes the proposed model particularly useful for self-adaptive settings, where node configurations are dynamic and require flexible and efficient aggregation. Although the proposed model has the drawback of increased data size, it reduces the number of communication rounds and scales to varying numbers of nodes and dynamic node groups.
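As a rough illustration of why only two interactions suffice, the sketch below summarizes the server-side computation once the updates have arrived. The callables add_ciphertexts and decode_mask_sum are placeholders for the homomorphic addition of the encrypted masks into E_U and for the CKKS decoding of E_U (which becomes decryptable once all nodes' contributions are aggregated); plain averaging over N nodes is assumed, and the exact scaling follows the scheme described earlier in the paper.

```python
import numpy as np

def aggregate(masked_updates, encrypted_masks, add_ciphertexts, decode_mask_sum):
    """Server-side aggregation requiring only the upload/download interactions.

    masked_updates  : list of D_i = w_i + m_i (each node's parameters plus its own mask).
    encrypted_masks : list of E_i, each mask encrypted under the node's own key and
                      carrying that node's partial decryption component.
    """
    D_U = np.sum(masked_updates, axis=0)     # cumulative sum of masked local parameters
    E_U = add_ciphertexts(encrypted_masks)   # aggregated encrypted masks
    mask_sum = decode_mask_sum(E_U)          # sum of masks, recovered without a separate decryption step
    return (D_U - mask_sum) / len(masked_updates)   # average of the actual local parameters (global update w)
```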

5. Discussion and Conclusions

In this paper, we have proposed a novel and practical aggregation method for federated learning, based on masking and HE techniques. This scheme preserves the privacy of local model parameters by masking them with a user-chosen random value, and the masking values are securely encrypted and eliminated using the proposed MKHE technique. The scheme does not require communication or data sharing among nodes and minimizes interactions between the nodes and the server. It enhances the security of user-chosen masking values by allowing nodes to use different encryption keys, and simplifies the aggregation process by enabling the server to compute the aggregated result of the actual local parameters without needing to perform a separate decryption procedure. Even if dropouts occur, the remaining active nodes can independently generate new updates and re-aggregate them without changing their encryption keys. As a result, our model requires only two interactions to complete the global model update, provided there are no dropouts. We believe that the proposed model is highly significant as it minimizes the inefficiency caused by using MKHE techniques and maximizes practicality and security for secure aggregation in federated learning.
We conducted experiments to evaluate the aggregation accuracy and computational efficiency of the proposed model. Compared to raw data aggregation, no decrease in accuracy was observed with our masking and HE-based aggregation: in both cases, the accuracy remained at approximately 97.7% under identical conditions. The model's accuracy is significantly influenced by the number of nodes, reaching a peak of approximately 97.74% for N = 10 under the IID distribution, which is about 1.27% lower than that of a centralized single model (N = 1). The overall accuracy of federated learning in our experiments depends on the accuracy of each node's local model; we therefore anticipate further improvements if each node can develop a robust local model from its own training data.
We also assessed the time required to complete a single round of our aggregation protocol. On average, computing the main HE-related parameters, such as the ring parameter, the key pair, and each node's local update, took between 35 ms and 115 ms. However, the average transmission time of these parameters between the server and nodes was notably longer, ranging from 590 ms to about 1569 ms even within the same local network environment; this delay is attributable to the increased data size caused by HE. In terms of security, our experiments used a ring polynomial of degree 2^16, an 800-bit decryption modulus, and a 1600-bit encryption modulus. The sizes of the ring parameter A, the public key pki, and the encrypted mask Ei were each approximately 13 MB, whereas the commitment ci was about 6.5 MB and the masked local parameter Di was roughly 0.17 MB. Given these data volumes, communication delays are likely as the communication load increases; however, they are effectively mitigated by minimizing the number of communication rounds. Compared to existing aggregation models, our proposed model significantly reduces the number of interactions between the server and nodes, requiring only two interactions to complete the global model update, provided there are no dropouts. We streamlined the aggregation process and believe our model considerably enhances the efficiency and real-world applicability of the overall aggregation process.
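For reference, the reported data sizes are consistent with the chosen parameters under simple assumptions. The sketch below (assuming 8-byte plaintext weights, ciphertext coefficients packed into whole 64-bit words, and pki consisting of two ring elements; the actual serialization format may differ) reproduces the figures in Table 5.

```python
RING_DEGREE = 2 ** 16   # degree of the ring polynomial (Table 2)
Q_BITS      = 1600      # encryption modulus size (bits)
q_BITS      = 800       # decryption modulus size (bits)
NUM_WEIGHTS = 21_840    # number of local model weight parameters

def poly_bytes(coeff_bits, degree=RING_DEGREE):
    """Size of one ring element, assuming coefficients stored in whole 64-bit words."""
    words_per_coeff = -(-coeff_bits // 64)   # ceiling division
    return degree * words_per_coeff * 8

print(poly_bytes(Q_BITS))       # 13,107,200 B ~ 12.5 MB -> A and E_i
print(2 * poly_bytes(q_BITS))   # 13,631,488 B ~ 13 MB   -> pk_i (two ring elements)
print(poly_bytes(q_BITS))       #  6,815,744 B ~ 6.5 MB  -> c_i
print(NUM_WEIGHTS * 8)          #    174,720 B ~ 0.17 MB -> D_i (8 bytes per weight)
```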
We aim to establish a secure and practical federated learning model for medical diagnosis, and we proposed the MHESA-based federated learning model as a potential solution. Given the critical importance of privacy protection for medical data, federated learning is highly appropriate for building comprehensive medical diagnostic models from the distinct datasets held by individual hospitals. Hospitals often possess substantial amounts of differentiated medical data and sufficient computational infrastructure to perform local learning independently. Furthermore, federated learning on medical data does not require frequent or real-time processing, making it relatively resilient to communication delays. Hospitals can therefore periodically engage in federated learning using the proposed model: each hospital updates its local model and transmits it to the federated server (FS), which promptly updates the global model by aggregating the data from all participating hospitals and shares the updated global model with them. While the use of homomorphic encryption may increase computational and transmission costs, these are manageable within hospital-level systems, and communication delays can be mitigated by allowing adequate waiting times. Most importantly, the proposed model eliminates the need for collaborative operations between hospitals and minimizes the communication rounds between hospitals and the federated learning server, thereby enhancing the overall efficiency of federated learning on medical data.
A drawback of the proposed method is that, if only one node is classified as a dropout, one additional active node must also be treated as a dropout to preserve the privacy of the dropped node. This limitation should be addressed so that all active nodes can participate in the aggregation under any disruption. In addition, the experiments in this paper were conducted on 100 nodes within the same network, which limits the analysis of communication delays in real federated learning environments. In each round of federated learning, a waiting time must be set to receive data from all nodes and to identify dropout nodes. Determining an appropriate waiting time requires consideration of various factors, including the computational workload of each local node, its computing power, the size of the data to be transmitted, and the communication environment. Based on these factors, an analysis of communication delays under actual network conditions is necessary.
In future work, we plan to address the issue of determining an appropriate waiting time for real-world communication environments as part of improving the proposed model. We will also continue to develop methods to improve dropout management, reduce communication overhead, and validate the aggregation results.

Author Contributions

Conceptualization, S.P.; methodology, S.P., J.L. and K.H.; software, J.L. and K.H.; validation, S.P., J.L., K.H. and J.C.; formal analysis, S.P.; data curation, J.C.; writing—original draft preparation, S.P., J.L. and K.H.; writing—review and editing, S.P. and J.C.; supervision, S.P.; project administration, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1F1A1063172).

Data Availability Statement

The original data presented in this study are openly available at http://yann.lecun.com/exdb/mnist (accessed on 20 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 9–11 May 2017; Volume 54. [Google Scholar]
  2. Zhu, L.; Liu, Z.; Han, S. Deep leakage from gradients. arXiv 2019, arXiv:1906.08935. [Google Scholar]
  3. Wang, Z.; Song, M.; Zhang, Z.; Song, Y.; Wang, Q.; Qi, H. Beyond inferring class representatives: User-level privacy leakage from federated learning. In Proceedings of the IEEE INFOCOM, Paris, France, 29 April–2 May 2019; pp. 2512–2520. [Google Scholar]
  4. Geiping, J.; Bauermeister, H.; Dröge, H.; Moeller, M. Inverting gradients—How easy is it to break privacy in federated learning? In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Online, 6–12 December 2020.
  5. Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd IEEE Annual Symposium on Foundations of Computer Science (sfcs 1982), Chicago, IL, USA, 3–5 November 1982; pp. 160–164. [Google Scholar]
  6. Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [Google Scholar] [CrossRef]
  7. Geyer, R.C.; Klein, T.; Nabi, M. Differentially private federated learning: A client level perspective. In Proceedings of the NIPS 2017 Workshop: Machine Learning on the Phone and other Consumer Devices, Long Beach, CA, USA, 8 December 2017. [Google Scholar]
  8. Wei, K.; Li, J.; Ding, M.; Ma, C.; Yang, H.H.; Farokhi, F.; Jin, S.; Quek, T.Q.S.; Poor, H.V. Federated Learning with Differential Privacy: Algorithms and Performance Analysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3454–3469. [Google Scholar] [CrossRef]
  9. Leontiadis, I.; Elkhiyaoui, K.; Molva, R. Private and dynamic timeseries data aggregation with trust relaxation. In Proceedings of the International Conferences on Cryptology and Network Security (CANS 2014), Seoul, Republic of Korea, 1–3 December 2010; Springer: Berlin/Heidelberg, Germany, 2014; pp. 305–320. [Google Scholar]
  10. Rastogi, V.; Nath, S. Differentially private aggregation of distributed time-series with transformation and encryption. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD 10), Indianapolis, IN, USA, 6–10 June 2010; pp. 735–746. [Google Scholar]
  11. Halevi, S.; Lindell, Y.; Pinkas, B. Secure computation on the Web: Computing without simultaneous interaction. In Advances in Cryptology—CRYPTO 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 132–150. [Google Scholar]
  12. Leontiadis, I.; Elkhiyaoui, K.; Önen, M.; Molva, R. PUDA—Privacy and Unforgeability for Data Aggregation. In Cryptology and Network Security. CANS 2015; Springer: Cham, Switzerland, 2015; pp. 3–18. [Google Scholar]
  13. Aono, Y.; Hayashi, T.; Wang, L.; Moriai, S. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 2017, 13, 1333–1345. [Google Scholar]
  14. Fang, H.; Qian, Q. Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning. Future Internet 2021, 13, 94. [Google Scholar] [CrossRef]
  15. Park, J.; Lim, H. Privacy-Preserving Federated Learning Using Homomorphic Encryption. Appl. Sci. 2022, 12, 734. [Google Scholar] [CrossRef]
  16. Liu, W.; Zhou, T.; Chen, L.; Yang, H.; Han, J.; Yang, X. Round efficient privacy-preserving federated learning based on MKFHE. Comput. Stand. Interfaces 2024, 87, 103773. [Google Scholar] [CrossRef]
  17. Jin, W.; Yao, Y.; Han, S.; Gu, J.; Joe-Wong, C.; Ravi, S.; Avestimehr, S.; He, C. FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System. arXiv 2023, arXiv:2303.10837. [Google Scholar]
  18. Wibawa, F.; Catak, F.O.; Kuzlu, M.; Sarp, S.; Cali, U. Homomorphic Encryption and Federated Learning based Privacy-Preserving CNN Training: COVID-19 Detection Use-Case. In Proceedings of the 2022 European Interdisciplinary Cybersecurity Conference (EICC’22), Barcelona, Spain, 15–16 June 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 85–90. [Google Scholar] [CrossRef]
  19. Hijazi, N.M.; Aloqaily, M.; Guizani, M.; Ouni, B.; Karray, F. Secure Federated Learning With Fully Homomorphic Encryption for IoT Communications. IEEE Internet Things J. 2024, 11, 4289–4300. [Google Scholar] [CrossRef]
  20. Sanon, S.P.; Reddy, R.; Lipps, C.; Schotten, H.D. Secure Federated Learning: An Evaluation of Homomorphic Encrypted Network Traffic Prediction. In Proceedings of the IEEE 20th Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2023; pp. 1–6. [Google Scholar] [CrossRef]
  21. Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic Encryption for Arithmetic of Approximate Numbers. In Advances in Cryptology-ASIACRYPT 2017; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2017; Volume 10624. [Google Scholar] [CrossRef]
  22. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical Secure Aggregation for Federated Learning on User-Held Data. arXiv 2016, arXiv:1611.04482. [Google Scholar]
  23. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the ACM SIGSAC Conferences on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1175–1191. [Google Scholar]
  24. Ács, G.; Castelluccia, C. I have a DREAM! (DiffeRentially privatE smArt Metering). In Information Hiding. IH 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 118–132. [Google Scholar]
  25. Goryczka, S.; Xiong, L. A comprehensive comparison of multiparty secure additions with differential privacy. IEEE Trans. Dependable Secur. Comput. 2017, 14, 463–477. [Google Scholar] [CrossRef] [PubMed]
  26. Elahi, T.; Danezis, G.; Goldberg, I. Privex: Private collection of traffic statistics for anonymous communication networks. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, 3–7 November 2014; pp. 1068–1079. [Google Scholar]
  27. Jansen, R.; Johnson, A. Safely Measuring Tor. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 1553–1567. [Google Scholar]
  28. So, J.; Güler, B.; Avestimehr, A.S. Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning. arXiv 2020, arXiv:2002.04156. [Google Scholar] [CrossRef]
  29. Kim, J.; Park, G.; Kim, M.; Park, S. Cluster-Based Secure Aggregation for Federated Learning. Electronics 2023, 12, 870. [Google Scholar] [CrossRef]
  30. LeCun, Y.; Cortes, C.; Burges, C.J. MNIST Handwritten Digit Database. 2010. Available online: http://yann.lecun.com/exdb/mnist (accessed on 4 October 2024).
Figure 1. Comparison of accuracy based on the number of nodes in IID and non-IID cases. (a) IID Accuracy. (b) Non-IID Accuracy.
Figure 2. Total aggregation time by the number of nodes with non-IID distribution.
Table 1. The size of data allocated to nodes.

Number of Nodes | IID Distribution | Non-IID Distribution (Minimum) | Non-IID Distribution (Maximum)
1               | 60,000           | 60,000                         | 60,000
10              | 6000             | 500                            | 12,700
20              | 3000             | 400                            | 4800
50              | 1200             | 100                            | 2200
100             | 600              | 100                            | 1200
Table 2. Experimental parameters and values.

Parameters                                                 | Values
The number of nodes                                        | 1, 10, 20, 50, 100
The number of rounds for federated learning                | 100
The number of weight parameters                            | 21,840
The size of n as the degree of the ring polynomial (bits)  | 16
The size of q as the modulus for decryption (bits)         | 800
The size of Q as the modulus for encryption (bits)         | 1600
Table 3. The average accuracy of the proposed model versus raw data aggregation.

Round | Accuracy of the Proposed Model (N = 10, IID) (%) | Accuracy of Raw Data Aggregation (%)
10    | 91.28 (0.10) *                                   | 91.27 (0.11) *
20    | 94.33 (0.08) *                                   | 94.32 (0.07) *
40    | 96.36 (0.05) *                                   | 96.36 (0.05) *
60    | 97.08 (0.05) *                                   | 97.09 (0.05) *
80    | 97.42 (0.05) *                                   | 97.43 (0.04) *
100   | 97.75 (0.04) *                                   | 97.74 (0.05) *
* The values in parentheses represent the standard deviation.
Table 4. The average accuracy (%) of the proposed model according to the number of nodes.

Round | N = 1          | N = 10 IID     | N = 10 Non-IID | N = 20 IID     | N = 20 Non-IID | N = 50 IID     | N = 50 Non-IID | N = 100 IID    | N = 100 Non-IID
10    | 97.69 (0.07) * | 91.27 (0.19) * | 87.96 (2.55) * | 84.29 (0.15) * | 87.47 (0.17) * | 53.65 (0.46) * | 66.98 (2.05) * | 16.26 (0.2) *  | 35.55 (3.67) *
20    | 98.35 (0.08) * | 94.27 (0.03) * | 92.15 (3.03) * | 91.19 (0.07) * | 91.19 (0.18) * | 76.70 (0.03) * | 86.15 (0.32) * | 47.21 (0.76) * | 66.09 (1.62) *
40    | 98.78 (0.06) * | 96.33 (0.04) * | 94.72 (2.11) * | 94.27 (0.04) * | 95.05 (0.12) * | 89.59 (0.12) * | 91.27 (0.16) * | 71.77 (0.34) * | 84.70 (0.43) *
60    | 98.94 (0.02) * | 97.08 (0.04) * | 95.68 (1.81) * | 95.54 (0.02) * | 96.21 (0.06) * | 92.10 (0.06) * | 93.06 (0.14) * | 85.77 (0.12) * | 88.9 (0.19) *
80    | 99.02 (0.07) * | 97.53 (0.05) * | 96.28 (1.56) * | 96.31 (0.03) * | 96.67 (0.1) *  | 93.35 (0.04) * | 94.13 (0.20) * | 88.65 (0.05) * | 90.74 (0.12) *
100   | 99.16 (0.05) * | 97.74 (0.05) * | 96.7 (1.32) *  | 96.73 (0.04) * | 97.05 (0.1) *  | 94.13 (0.03) * | 94.93 (0.22) * | 90.39 (0.03) * | 91.79 (0.15) *
* The values in parentheses represent the standard deviation.
Table 5. Data sizes of main parameters.

Parameters     | Size (Byte)
A              | 13,107,200 (=12.5 MB)
pki, PKU       | 13,631,488 (=13 MB)
ci, CU         | 6,815,744 (=6.5 MB)
wi, Di, DU, w  | 174,720 (=0.17 MB)
Ei, EU         | 13,107,200 (=12.5 MB)
Table 6. The actual average time to perform main operations related to HE.

Phase    | Actor  | Operations                                              | Average Time (ms)
Setup    | Server | Creating a ring polynomial A                            | 34.9399
         |        | Sending A to each node                                  | 1568.971
         | Node   | Computing the pki and ci pair                           | 114.8651
         |        | Sending pki and ci to the server                        | 1117.554
         | Server | TPKC: adding a single pki to PKU and a single ci to CU  | 4.763361
         |        | Computing PKU and CU for all N nodes                    | N·TPKC
         |        | Transmitting PKU and CU to each node                    | 1263.832
Learning | Node   | Computing the Di and Ei pair                            | 60.08073
         |        | Transmitting Di and Ei to the server                    | 590.122
         | Server | TD: adding a single Di to DU                            | 0.026612
         |        | Computing DU for all N nodes                            | N·TD
         |        | TE: adding a single Ei to EU                            | 7.099121
         |        | Computing EU for all N nodes                            | N·TE
         |        | Computing w from DU and EU, including decoding EU       | 0.15177
Table 7. Comparison of the key features of the proposed model with existing aggregation models.

Privacy-preserving strategies for local parameters
  - The proposed model: Masking and HE (based on CKKS-HE)
  - Masking-based aggregation by K. Bonawitz et al. [22]: Masking
  - HE-based aggregation by J. Park and H. Lim [15]: HE (based on a Distributed Homomorphic Cryptosystem)
  - MKFHE-based aggregation by W. Liu et al. [16]: HE (based on CKKS-HE)

Use of additional third parties
  - The proposed model: X
  - K. Bonawitz et al. [22]: X
  - J. Park and H. Lim [15]: Computation Provider (CP)
  - W. Liu et al. [16]: X

Number of interactions required for aggregation (assuming the set of nodes for aggregation has been determined and no dropouts occurred)
  - The proposed model: 2 (N→S: send local updates; S→N: send global update)
  - K. Bonawitz et al. [22]: 4 (N→S: send masks for other nodes; S→N: distribute masks; N→S: transmit masked local update; S→N: transmit global update)
  - J. Park and H. Lim [15]: 4 (N→S: transmit encrypted local update; S→CP: transmit partially decrypted aggregation; CP→S: transmit encrypted sum; S→N: transmit encrypted global update)
  - W. Liu et al. [16]: 4 (N→S: transmit encrypted local update; S→N: transmit encrypted aggregation; N→S: transmit partial decryption; S→N: transmit global update)

Type of global update retrieved by the server
  - The proposed model: Global update
  - K. Bonawitz et al. [22]: Global update
  - J. Park and H. Lim [15]: Encrypted global update
  - W. Liu et al. [16]: Global update

Use of different keys for nodes
  - The proposed model: O
  - K. Bonawitz et al. [22]: N/A
  - J. Park and H. Lim [15]: O
  - W. Liu et al. [16]: O

Update encryption key when dropout occurs
  - The proposed model: X
  - K. Bonawitz et al. [22]: N/A
  - J. Park and H. Lim [15]: N/A
  - W. Liu et al. [16]: O

Decryption process
  - The proposed model: X
  - K. Bonawitz et al. [22]: N/A
  - J. Park and H. Lim [15]: O (each node needs to decrypt the encrypted global update)
  - W. Liu et al. [16]: O (each node needs to partially decrypt the given aggregation)