[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Next Article in Journal
New Insights into Laryngeal Articulation and Breathing Control of Trumpeters: Biomedical Signals and Auditory Perception
Next Article in Special Issue
Chosen-Ciphertext Secure Unidirectional Proxy Re-Encryption Based on Asymmetric Pairings
Previous Article in Journal
Beyond Presence: Exploring Empathy within the Metaverse
Previous Article in Special Issue
Protecting Instant Messaging Notifications against Physical Attacks: A Novel Instant Messaging Notification Protocol Based on Signal Protocol
You seem to have javascript disabled. Please note that many of the page functionalities won't work as expected without javascript enabled.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

A Reliable Aggregation Method Based on Threshold Additive Secret Sharing in Federated Learning with Quality of Service (QoS) Support

by
Yu-Ting Ting
,
Po-Wen Chi
* and
Chien-Ting Kuo
Department of Computer Science and Information Engineering, National Taiwan Normal University, Taipei 116, Taiwan
*
Author to whom correspondence should be addressed.
IEEE Member.
Appl. Sci. 2024, 14(19), 8959; https://doi.org/10.3390/app14198959
Submission received: 12 August 2024 / Revised: 27 September 2024 / Accepted: 2 October 2024 / Published: 4 October 2024
(This article belongs to the Special Issue Cryptography in Data Protection and Privacy-Enhancing Technologies)
Figure 1
<p><span class="html-italic">t</span>-out-of-<span class="html-italic">n</span> federated learning architecture diagram. In this example, <span class="html-italic">t</span> is 2 and <span class="html-italic">n</span> is 4. After training the model locally, the client divides the gradients into 4 shares and distributes them to all servers. Each server aggregates the received shares. The selected 2 servers, the leftmost server and the rightomst server here, then return the aggregated shares to the client, which aggregates the <span class="html-italic">t</span> received shares to update the model.</p> ">
Figure 2
<p><math display="inline"><semantics> <msub> <mi mathvariant="normal">C</mi> <mi mathvariant="normal">i</mi> </msub> </semantics></math> splits the gradients after local training and sends the gradients and data volume to <math display="inline"><semantics> <msub> <mi>Svr</mi> <mi mathvariant="normal">s</mi> </msub> </semantics></math>.</p> ">
Figure 3
<p>From <span class="html-italic">n</span> servers, select <span class="html-italic">t</span> servers, and calculate the weighted average of the selected <span class="html-italic">t</span> servers.</p> ">
Figure 4
<p>All clients update the model after receiving <span class="html-italic">w</span> from the <span class="html-italic">t</span> servers.</p> ">
Figure 5
<p>Process diagram for providing models of different accuracy to clients of varying levels.</p> ">
Figure 6
<p>Comparison of the accuracy of our proposed method with FedAvg [<a href="#B2-applsci-14-08959" class="html-bibr">2</a>] and FedShare [<a href="#B8-applsci-14-08959" class="html-bibr">8</a>] using the IID MNIST [<a href="#B29-applsci-14-08959" class="html-bibr">29</a>] dataset with 5 and 10 servers and varying numbers of clients.</p> ">
Figure 7
<p>Comparison of the accuracy of our proposed method with FedAvg [<a href="#B2-applsci-14-08959" class="html-bibr">2</a>] and FedShare [<a href="#B8-applsci-14-08959" class="html-bibr">8</a>] using the IID Fashion-MNIST [<a href="#B30-applsci-14-08959" class="html-bibr">30</a>] dataset with 5 and 10 servers and varying numbers of clients.</p> ">
Figure 8
<p>Comparison of the accuracy of our proposed method with FedAvg [<a href="#B2-applsci-14-08959" class="html-bibr">2</a>] and FedShare [<a href="#B8-applsci-14-08959" class="html-bibr">8</a>] using the Non-IID MNIST [<a href="#B29-applsci-14-08959" class="html-bibr">29</a>] and Non-IID Fashion-MNIST [<a href="#B30-applsci-14-08959" class="html-bibr">30</a>] datasets with 5 servers and varying numbers of clients.</p> ">
Figure 9
<p>Comparison of the average execution time using the IID MNIST [<a href="#B29-applsci-14-08959" class="html-bibr">29</a>] and IID FashionMNIST [<a href="#B30-applsci-14-08959" class="html-bibr">30</a>] datasets for different numbers of clients and servers with FedAvg [<a href="#B2-applsci-14-08959" class="html-bibr">2</a>], FedShare [<a href="#B8-applsci-14-08959" class="html-bibr">8</a>], and our method.</p> ">
Figure 10
<p>Comparison of the average execution time using the Non-IID MNIST [<a href="#B29-applsci-14-08959" class="html-bibr">29</a>] and Non-IID FashionMNIST [<a href="#B30-applsci-14-08959" class="html-bibr">30</a>] datasets for different numbers of clients and servers with FedAvg [<a href="#B2-applsci-14-08959" class="html-bibr">2</a>], FedShare [<a href="#B8-applsci-14-08959" class="html-bibr">8</a>], and our method.</p> ">
Figure 11
<p>The probability that the mechanism fails due to the number of damaged servers for 4-out-of-16 and 5-out-of-25 situations.</p> ">
Figure 12
<p>The probability that an attacker successfully reconstructs the gradient by attacking <span class="html-italic">d</span> servers for the 4-out-of-16 and 5-out-of-25 situations.</p> ">
Figure 13
<p>Box plot of gradient model accuracy at different magnifications (50 times).</p> ">
Versions Notes

Abstract

:
Federated learning is a decentralized privacy-preserving mechanism that allows multiple clients to collaborate without exchanging their datasets. Instead, they jointly train a model by uploading their own gradients. However, recent research has shown that attackers can use clients’ gradients to reconstruct the original training data, compromising the security of federated learning. Thus, there has been an increasing number of studies aiming to protect gradients using different techniques. One common technique is secret sharing. However, it has been shown in previous research that when using secret sharing to protect gradient privacy, the original gradient cannot be reconstructed when one share is lost or a server is damaged, causing federated learning to be interrupted. In this paper, we propose an approach that involves using additive secret sharing for federated learning gradient aggregation, making it difficult for attackers to easily access clients’ original gradients. Additionally, our proposed method ensures that any server damage or loss of gradient shares are unlikely to impact the federated learning operation, within a certain probability. We also added a membership level system, allowing members of varying levels to ultimately obtain models with different accuracy levels.

1. Introduction

In recent years, machine learning has been increasingly used in various fields and has become an indispensable part of our daily lives. In the field of machine learning, two key roles are data owners and developers. As developers aim to improve models, they often need large amounts of diverse data. To achieve this goal, they sometimes ignore data privacy, aggregating datasets from multiple data owners for training, sharing these data with each other and even third parties. However, this does not mean that all customers also trust or are willing to hand over their information to third parties. Therefore, how data owners can jointly train models without sharing private data with third parties has become an important issue. To address privacy issues related to data exchange, Google proposed federated learning (FL) [1,2] as a solution in 2016. Federated learning is a decentralized form of distributed learning where users locally train models using their own data. After completing this training, they upload their respective model parameters to a central aggregator. The central aggregator integrates these parameters and then returns the aggregated model to all users. Through federated learning, users can collectively train a model without having to hand over their individual data.
However, recent studies show that gradients contain sensitive private information from the original data. Thus, attackers can potentially exploit gradients to compromise individuals’ privacy [3,4,5,6]. To address these issues, many studies have employed techniques such as secure multiparty computation, homomorphic encryption, and differential privacy to prevent privacy leaks. However, because homomorphic encryption increases the operational costs and differential privacy adds noise to the model that impacts on the overall accuracy, we have chosen additive secret sharing to ensure that gradients cannot be easily obtained by attackers. In the past research on using additive secret sharing to protect model parameters [7,8], the methods are n-out-of-n. We considered the real-world risk of server failure and devised a solution that can aggregate the gradients of all clients by selecting t servers from n servers, given certain prerequisites. This t-out-of-n concept not only ensures the security of the gradients but also allows for a certain probability of server failure without affecting overall operations.
In addition to ensuring the availability of the mechanism, we also propose a quality of service (QoS)-enabled federated learning mechanism. We want the trained models to provide different accuracy levels for users according to their attributes. The concept of providing a less accurate model may seem counterintuitive but is not uncommon in the real world. For example, in 1983, the United States opened the Global Positioning System (GPS) for civilian use. However, for reasons related to national security, the United States used Selective Availability (SA) to add errors to civilian GPS, reducing its accuracy, while the U.S. military and its allies were able to use highly accurate GPS systems. It was not until May 2000 that the United States announced it would stop using SA and thus allow civilian agencies to obtain accurate GPS systems.
We incorporate the concept of SA into our proposed distributed federated learning scheme to provide different service level models to various customers, applying a membership hierarchy. By modifying gradients on each server, which can affect the model parameters, it is possible to derive various models with different accuracies. Finally, each client is provided with a model of different accuracy depending on their service level.
The contributions of our work are summarized as follows:
  • We construct a threshold additive secret sharing scheme and propose a reliable distributed federating learning system. The model can be aggregated even when some servers are offline or not functioning.
  • We apply the QoS concept to federated learning to provide models of differing accuracy levels to users based on their QoS level.
  • We implement the proposed federated learning system and evaluate it using two real datasets. We also analyze the probability of system failure and the successful attack rate in our t-out-of-n secret sharing scheme.
The organization of this paper is as follows: In Section 2, we delve into the relevant literature concerning federated learning. In Section 3, we discuss the relevant technologies. In Section 4, we introduce the method we propose. In Section 5, we introduce our proposed quality of service framework for federated learning. In Section 6, we present the experimental evaluation. Finally, in Section 7, we provide the conclusions and discuss future work.

2. Related Work

In this section, we introduce federated learning, the base technique of our scheme. We first introduce some federated algorithms and then some examples of related gradient leakage attacks on federated learning. Finally, we introduce some research works about protecting privacy in federated learning.

2.1. Federated Learning

FedAvg [2] is the first proposed federated learning algorithm, which focuses on participants’ data volume. The server selects K clients and sends the original model W t to each client. Clients then train the model on their respective datasets and return the updated parameters W i + 1 i and data volume n i to the server. The server calculates a weighted average W t + 1 of the parameters based on the data volume of each client to generate a new model W t + 1 k = 1 K n k n W t + 1 k . Finally, the server sends the new model back to the clients. The above steps are repeated until the specified number of iterations is reached. In recent years, there has been an increasing amount of research on extending federated learning from a single server to multiple servers. Han et al. [9] proposed deploying multiple edge servers to accelerate training in federated learning. Qu et al. [10] designed a multi-server federated learning architecture, MS-FedAvg, based on FedAvg, where the clients update the model based on the servers in the region.
However, recent studies have shown that gradients contain sensitive private information from the original data. Zhu et al. [3] proposed an attack termed Deep Leakage from Gradients (DLG). In DLG, the attacker provides a dummy dataset x and dummy labels y as input to the differentiable machine learning model F ( x , W ) along with training parameter weights W. After training, the attacker obtains dummy gradients specific to ( x , y ) . Subsequently, the attacker provides a dummy dataset x and dummy labels y as the input to the differentiable machine learning model F ( x , W ) along with the training parameter weights W. After training, the attacker obtains dummy gradients specific to ( x , y ) . The attacker then calculates the difference between the real gradients W and the dummy gradients W , which gradually converge over time:
x * , y * = arg min x , y W W 2 = arg min x , y ( F ( x , W ) , y ) W W 2
As the gap between the dummy gradients and real gradients narrows, the dummy dataset x also aligns more closely with the real dataset x. Therefore, even if clients only upload gradients to the central server, attackers can still reverse engineer the training data via the gradients. Hence, it can be concluded that federated learning is not entirely secure. Wei et al. [5] presented an algorithm that utilizes attack seed-generated gradients to approximate the real gradients and reconstruct the local training data. Geiping et al. [6] demonstrated that it is possible to reconstruct actual input data from gradient information, allowing every fully connected layer to be reconstructed. Zhao et al. proposed iDLG [4] to extract the ground truth label from the gradients that clients share with the server. It can be seen that transmitting only the gradient parameters between the server and the client does not ensure security.

2.2. Privacy-Preserving Federated Learning

Many solutions for preventing information leakage attacks have been proposed in recent years, to ensure that third parties cannot reconstruct the real training data. Common techniques used to research gradient protection include homomorphic encryption, differential privacy [11], and secure multiparty computation (SMC) [12].
Homomorphic encryption allows operations to be performed directly on ciphertext, yielding results in ciphertext form. Therefore, encrypted gradients can be aggregated over the ciphertext space, ensuring that gradients are not accessible to third parties. Park and Lim [13] proposed the privacy-preserving federated learning algorithm, which enables the central server to aggregate encrypted model parameters. Fang and Qian [14] utilized an improved version of the Paillier encryption scheme to protect gradient information. Zhang et al. [15] encoded a batch of quantized gradients into a long integer and used homomorphic encryption for protection. Ma et al. [16] adopted the concept of a multi-key homomorphic encryption protocol to encrypt gradients before aggregation on the server. Ku et al. [17] designed a system in which, after the clients complete training, the fog nodes perform the first encryption, followed by a second encryption. While homomorphic encryption enables the aggregation of encrypted gradients, thereby preventing gradient leakage, it incurs substantial computational and storage costs.
Differential privacy [11] was initially developed for protecting the privacy of databases. By slightly modifying the query result, the attacker cannot completely reconstruct the database content. Chuanxin et al. [18] used Gaussian differential privacy to build the federated learning scheme Noisy-FL, where noise is added before each round in which the server performs gradient averaging. Wei et al. [19] implemented the addition of artificial noise to the client’s parameters before aggregating them at the central server. Triastcyn and Faltings [20] applied Bayesian differential privacy to federated learning. LDP-FL [21] allows clients to perturb model weights with noise, divide and shuffle them, and then upload the weights to the central server for aggregation. While differential privacy is effective for protecting privacy, adding noise can impact the overall accuracy of a model. Balancing accuracy and privacy has always been a major concern in the context of differential privacy.
SMC [12] is a technique in which a task is divided into subtasks, which are distributed to multiple parties. Once all subtasks are completed, their results can be merged to derive the result for the original task. Note that there is no leakage of the original task result when processing subtasks. There are many studies on applying SMC in federated learning. In federated learning with one cloud server and n clients, Duan et al. [22] proposed that each client uses secret sharing to split gradients into n shares and send their shares to other clients, and clients then send the aggregated parameters to the server. In the Chain-PPFL scheme [23], a client generates a new token based on the trained model parameters and passes it to the next client, and the last client sends the token to the server, reflecting a chained-communication mechanism architecture. In federated learning with multiple servers and n clients, HybridAlpha [24] uses functional encryption for SMC to protect model gradients. More et al. [7] introduced the SCOTCH framework. In a scenario with k servers and n clients, when the clients finish local training, SCOTCH divides each client’s model into k shares via additive secret sharing, and each client sends the i-th share to the corresponding i-th server. The servers then sum the received shares and divide the total by the number of servers. Finally, the servers return the aggregated share to each client. Khojir et al. [8] extended the SCOTCH [7] framework and named it Fedshare [8]. The authors contend that each client has a different amount of data, so it is important that different clients are assigned different weights when aggregating parameters on the servers. Similar to SCOTCH [7], Fedshare [8] deploys k servers and a main server, and adopts additive secret sharing to divide the parameters of each client’s model into k shares. All clients send their respective shares and data volumes to the servers. When the servers receive information on the shares and data volume of each client, they can sum up the data volumes and compute the weighted average. Finally, the servers send the weighted average information to the main server for summing, and they can obtain the new model. Both SCOTCH [7] and Fedshare [8] encounter a serious problem: if even a single server is damaged, it is impossible to derive the desired model, as the model can be merged only when all servers provide their shares. Therefore, in this paper, we propose ( t , n ) -threshold secret sharing federated learning, which enables the system to tolerate some degree of server failure.

3. Preliminaries

In this section, we first introduce a simple 2-out-of-2 additive secret sharing scheme, followed by its extension to a 2-out-of-n scheme. To construct a t-out-of-n secret sharing scheme, we use a technique similar to the extension approach proposed in a summary of a lecture on secret sharing [25].

3.1. 2-Out-of-2 Additive Secret Sharing

Additive secret sharing is implemented within a field F . A secret S F is randomly divided into several shares. Additive secret sharing consists of the two algorithms Share and Reconstruct, for simple 2-out-of-2 secret sharing:
Share 2 2 :
On input secret S,
  • select s 0 F uniformly at random.
  • set s 1 = S s 0 .
  • output ( s 0 , s 1 ) .
Reconstruct 2 2 :
On input ( s 0 , s 1 ) ,
  • Output s 0 + s 1 .
The above is a 2-out-of-2 additive secret sharing scheme. In this scheme, you must hold both s 0 and s 1 shares to reconstruct the secret S. We can easily extend this into n-out-of-n additive secret sharing scheme. Similarly, all n shares must be available to reconstruct the secret.

3.2. Build 2-Out-of-n Additive Secret Sharing from 2-Out-of-2 Additive Secret Sharing

In a 2-out-of-n scheme, the secret S is split into n shares. The secret S can be reconstructed as long as any 2 of the n shares are obtained. The simple 2-out-of-n secret sharing scheme is as follows:
Share 2 n :
On input secret S,
  • For i = 0 to log ( n 1 )
    -
    Execute Share 2 2 ( S ) ( s i , 0 , s i , 1 ) .
  • For j = 0 to n 1
    -
    Set G j = ( j , s 1 , j 1 , , s log n , j log n ) , where binary representation of j is j 1 j log n . Here we use q to represent log n , so G j = ( j , s 1 , j 1 , , s q , j q ) , and j is j 1 j q
  • output ( G 0 , , G n 1 ) .
Reconstruct 2 n :
On input ( G i , G j ) ,
  • Convert i and j to their binary representations, i = i 1 i q and j = j 1 j q .
  • Find a digit position k where i k j k , the corresponding s k , i k = s k , 0 , s k , j k = s k , 1 or the other way around, s k , i k = s k , 1 , s k , j k = s k , 0 .
  • Execute Reconstruct 2 2 ( s k , 0 , s k , 1 ) and output S, where S = s k , 0 + s k , 1 .

4. Federated Learning with Threshold Secure Aggregation

4.1. Overview

We consider n honest-but-curious servers (numbered from Svr 0 to Svr n 1 ), z clients (numbered from C 0 to C z 1 ), and a level server. Each client z i has their own dataset with a size of l i , and the total data size across all clients is L. Figure 1 is the proposed architecture. Prior to the beginning of each round of federated learning, the clients collectively decide a number t, where 2 n and 1 < t n , which means that t servers are selected from n servers for aggregation. The numbering of each server is then converted from decimal to t-base according to the clients’ choice. After the conversion is completed, the length of servers’ number is q bits, and the number of the Svr i is i 0 i q 1 .
After each round of local training, the clients use secret sharing to split their model’s gradients, then send the shares and dataset sizes to each server based on predefined rules. When all servers receive the shares and dataset sizes, t servers that meet the conditions are selected to aggregate the shares they received and return the results to each client. Finally, clients can update their models and start the next round of training by summing the gradients received from the t servers.
We assume a passive and honest-but-curious secure threat model, where servers comply with all protocol specifications but may attempt to infer private information based on inputs. Moreover, attackers will try to obtain t servers that can recover the information to access private data.

4.2. Threshold Additive Secret Sharing

In this section, we describe how n-out-of-n additive secret sharing can be extended into t-out-of-n additive secret sharing.

4.2.1. n-Out-of-n Additive Secret Sharing

We simply extend 2-out-of-2 additive secret sharing to n-out-of-n additive secret sharing:
Share n n :
On input secret S,
  • For i = 0 to n 1
    -
    select s i F uniformly at random.
  • Set s n 1 = S i = 0 n 2 s i .
  • Output ( s 0 , , s n 1 ) .
Reconstruct n n :
On input ( s 0 , , s n 1 ) ,
  • Output S = i = 0 n 1 s i .
Theorem 1.
The scheme ( Share n n , Reconstruct n n ) is a correct and secure additive secret sharing scheme.
Proof. 
Correctness follows immediately from ( Share n n , Reconstruct n n ) :
Pr Share ( S ) ( s 0 , , s n 1 ) [ Reconstruct ( s 0 , , s n 1 ) = S ] = Pr Share ( S ) ( s 0 , , s n 1 ) [ s 0 + + s n 1 = S ] = 1
Security follows immediately from ( Share n n , Reconstruct n n ) : let any S , S F , and consider one share. If it is the first share, and it is selected uniformly that α F ,
Pr Share ( S ) ( s 0 , , s n 1 ) [ s 0 = α ] = 1 | F | = Pr Share ( S ) ( s 0 , , s n 1 ) [ s 0 = α ] .
If it is the last share, and it is selected uniformly that α F ,
Pr Share ( S ) ( s 0 , , s n 1 ) [ s n 1 = α ] = Pr Share ( S ) ( s 0 , , s n 1 ) [ S ( s 0 + s 1 + + s n 2 ) = α ] = Pr Share ( S ) ( s 0 , , s n 1 ) [ ( s 0 + s 1 + + s n 2 ) = S α ] = 1 | F |
Similarly,
Pr Share ( S ) ( s 0 , , s n 1 ) [ s n 1 = α ] = Pr Share ( S ) ( s 0 , , s n 1 ) [ ( s 0 + s 1 + + s n 2 ) = S α ] = 1 | F |
and both probabilities are equal, then ( Share n n , Reconstruct n n ) is a secure scheme.
   □

4.2.2. Build t-out-of-n Additive Secret Sharing from t-out-of-t Additive Secret Sharing

t-out-of-n secret sharing was proposed by Shamir [26]. While it is a correct and secure scheme, it does not have additive characteristics. Therefore, we use the concept from [25] to extend the additive secret sharing scheme of 2-out-of-n from 2-out-of-2 to t-out-of-n via t-out-of-t. Here, we use q to represent log t ( n 1 ) ,
Share t n :
On input secret S,
  • For i = 0 to q
    -
    Execute Share t t ( S ) ( s i , 0 , , s i , t 1 ) .
  • For j = 0 to n 1
    -
    Set G j = ( s 0 , j 0 , s 1 , j 1 , , s q , j q ) , where t-base representation of j is j 0 j q .
  • Output ( G 0 , , G n 1 ) .
Reconstruct t n :
On input t Gs selected from ( G 0 , , G n 1 ) ,
  • Convert the index i of the selected t Gs into t-base representation. For example, if G 1 is selected, then i = 1 and i 1 0 1 q .
  • Find a digit position k where the k-th digit of the t-base representation of the indices of the t selected Gs is different for each index, and use k to find corresponding s k , i k . For example, if G 1 is selected, then i = 1 , s k , 1 k is the share we need. Finally, we obtain t shares from t selected Gs, and they can conveniently form ( s k , 0 , , s k , t 1 ) .
  • Execute Reconstruct t t ( s k , 0 , , s k , t 1 ) and output S, where S = s k , 0 + + s k , t 1 .
Theorem 2. 
If ( Share t t , Reconstruct t t ) is a correct and secure n-out-of-n additive secret sharing scheme, then the scheme ( Share t n , Reconstruct t n ) is a correct and secure t-out-of-n additive secret sharing scheme.
Proof. 
Correctness: The indices of all t selected Gs are different, and t indices 0 , , n 1 ,
Pr Share t n ( S ) ( G 0 , , G n 1 ) [ Reconstruct t n ( G s ) = S ] = Pr Share t t ( S ) ( s k , 0 , , s k , t 1 ) [ Reconstruct t t ( s k , 0 , , s k , t 1 ) = S ] = 1
where, in the first equation, among the indices of the selected t G ’s, at least one must have a k-th digit that differs from the others. The second equation is ensured by the correctness of the ( Share n n , Reconstruct n n ) additive secret sharing scheme.
Security: For Share t n ( S ) ( G 0 , , G n 1 ) , we need to choose t Gs, and these tGs must have a different k-th digit. Through t Gs we can obtain ( s k , 0 , , s k , n 1 ) . Similarly, Share t n ( S ) ( G 0 , , G n 1 ) also requires the selection of t G s with a different k-th digit. Similarly to Share t t ( S ) , if it is the first share, and α F ,
Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , 0 = α ] = 1 | F | = Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , 0 = α ] .
If it is last share, then
Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , n 1 = α ] = Pr Share t n ( S ) ( G 0 , , G n 1 ) [ S ( s k , 0 + + s k , n 2 ) = α ] = Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , 0 + + s k , n 2 = S α ] = 1 | F |
Similarly,
Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , n 1 = α ] = Pr Share t n ( S ) ( G 0 , , G n 1 ) [ s k , 0 + + s k , n 2 = S α ] = 1 | F |
and both probabilities are equal, so ( Share t n , Reconstruct t n ) is a secure scheme.    □
The t-out-of-n additive secret sharing scheme is established under the condition that t 2 . Additionally, there is a probability that the selected t Ms cannot find the k-th digit, which differ from each other. In such cases, t Ms must be re-selected. The probability of this occurrence is detailed in Section 6.2.

4.3. Proposed Approach

Our proposed architecture can be mainly divided into three phases: local training, server aggregation, and model update. Initially, the clients decide on the number t of servers. Table 1 lists all the symbols we use in the model and their respective meanings.
Local training. C i trains a model and obtains new model gradients M i . When the training reaches the specified epoch, the clients execute Share t n ( M ) to generate q groups, with each group containing t shares, as shown in Algorithm 1 and Figure 2.
Algorithm 1 Splitting gradients after local training
Input: all servers’ number length q, M i is the gradients obtained after training for C i , t is the number of servers that the clients choose to eventually aggregate.
  for j 0 to q 2  do
    for  k 0 to t 1  do
       m i , j , k = a random integer in the range F
    end for
     m i , j , t 1 = M i k = 0 t 2 m i , j , k
  end for
For example, C i receives the new gradients M i , and executes Share t n ( M i ) to obtain
( m i , 0 , 0 , m i , 0 , 1 , , m i , 0 , t 1 ) , ( m i , 1 , 0 , m i , 1 , 1 , , m i , 1 , t 1 ) , , ( m i , q 1 , 0 , m i , q 1 , 1 , , m i , q 1 , t 1 ) ,
among which
M i = m i , 0 , 0 + m i , 0 , 1 + + m i , 0 , t 1 = m i , 1 , 0 + m i , 1 , 1 + + m i , 1 , t 1 = m i , q 1 , 0 + m i , q 1 , 1 + + m i , q 1 , t 1
After dividing the gradients into multiple groups of shares m i , j , k , the clients need to send these shares to the servers. Next, we introduce the rules for which share the clients send to which server, as shown in Algorithm 2 and Figure 2. The clients decide which server to send the share m i , j , k to based on the second parameter j and third parameter k. When C i wants to send shares to Svr s , C i sends m i , 0 , s 0 , , m i , q 1 , s q 1 , a total of q shares, and data volumes l i to Svr s .
Algorithm 2 Sending shares and data volume to servers
Input:  ( m i , 0 , 0 , , m i , 0 , t 1 ) , , ( m i , q 1 , 0 , , m i , q 1 , t 1 ) is the q group of gradient shares of C i , s 0 s q 1 is the identifier of Svr s , l i is the data volume of C i .
  for  j 0 to q 1  do
     C i sends m i , j , s j to Svr s
  end for
   C i sends l i to Svr s
This is in contrast to Fedshare [8], where each client sends one share and one data size value to each server. In our proposed architecture, each server receives q shares and a data size value from each client. Thus, each server will have a total of q z shares.
Server aggregation. After the servers have received the gradient shares and the data volume from each client, we need to select t servers (Algorithm 3) and aggregate the shares of t servers (Algorithm 4). As shown in Figure 3, we generate t numbers, each with q empty bits, and randomly choose a number h where 0 h q 1 . Then, we randomly fill in the all bits with 0 to t 1 , and all number’s h-th digit is filled in 0 to t 1 , respectively. When all the bits are filled in and we collect t numbers, we need to confirm whether these t numbers can be found in the set of all server numbers. If a number is not found among the server numbers, we need to search for the t numbers again.
Algorithm 3 Select servers
Input: number of servers n, all servers’ number length q, S = { 0 0 0 q 1 , , n 1 0 n 1 q 1 } is the set of the number of each server, t is the number of servers that the clients choose to eventually aggregate.
Output: n u m , h
   n u m = [ ]
   f i x n u m = [ 0 , , t 1 ]
  while len(num) < t  do
       h random ( 0 , q 1 )
      for  r 0 to t do
         Generate a random q-digit number composed of the digits 0 to t 1
           e Replace the h-th digit with fixnum [ r ]
          if e not in nume in S then
             num [ r ] e
          end if
      end for
  end while
  return num, h
Algorithm 4 Server aggregation
Input: num is the identifier of the selected t server, number of clients a, m i , j , k is the k-th share of the j-th secret shared by C i , l 0 , , l z 1 is each client’s data volume, h is the different bit of the selected server.
  for r in num do
       L = k = 0 z 1 l k
      for  i 0 to z do
          Server r calculates weighted average w r = k = 0 z 1 m k , h , r h l k L
      end for
      return  w r
  end for
The servers whose numbers correspond to the selected t numbers are the servers to be aggregated. Those servers sum up the data volume l i of all clients to obtain the overall data volume L. Subsequently, those servers calculate the weighted average w as shown in Equation (2) (Algorithm 4).
w = k = 0 z 1 m k , h , i h l k L
For the correctness of aggregating t selected servers from n servers, please refer to Theorem 2.
Model update. After t servers finish calculating w, they send w to all clients. When the clients receive w from t servers, they sum up w to obtain the new model gradients, and the clients can then update the model to start a new round of training, as shown in Algorithm 5 and Figure 4.
Algorithm 5 Model update
Input: number of clients z, t is the number of servers that the clients choose to eventually aggregate, w is the weighted average of the gradient shares computed by the servers.
  for r in n u m  do Broadcast w r to each client
  end for
  Clients sum up all the weighted averages w they have received and use new gradients to update the model.

5. Quality of Service in Federated Learning

In this section, we propose a method for obtaining models with different accuracy levels by adjusting the gradient after completing federated learning training. This method incorporates the concept of SA and can preserve the security of federated learning. Customers can choose according to their own needs, and we can provide correspondingly adjusted models based on each customer’s membership level.
During the model training process, the model parameters are constantly updated based on the gradients to better fit the training data. Thus, we leverage the fact that the parameters are changed according to the gradients, where we aim to influence the model accuracy by changing the gradients.
We have set the membership level for clients to choose from. The highest level corresponds to receiving an unadjusted model, and the lower levels to a model with larger gradient adjustments. Because the gradients affect the model parameters, members of a lower level may receive a model with lower accuracy. A simple process diagram is shown in Figure 5. The entire process can be divided into five steps:
Step 1:
Before federated learning begins, clients choose their level according to their needs and pay the corresponding amount. All client choices are sent to each server in the first round of federated learning. The transmission process, shown in Algorithm 6, is very similar to the previous one, Algorithm 2, except that the clients send their selected level to the servers. Level information is only transmitted after the first round of local training in federated learning, and subsequent rounds are carried out according to Algorithm 2.
Step 2:
To allow clients of different levels to receive with different degrees of adjustment, we formulate the Level Range Table for the level server. The table lists the random number ranges for each level. In Step 4, the generated random numbers are used to adjust the gradients.
Step 3:
The level server generates a Gradient Adjustment Table. This table lists p random numbers ( r 1 , r 2 , . . . , r p ) in the corresponding range for each level according to the Level Range Table (Algorithm 7).
Step 4:
The level server sends the Gradient Adjustment Table to t selected servers. These servers adjust the model gradients based on the level indicated by the client in Step 1. They multiply the i-th layer of the model by the corresponding level r i in the Gradient Adjustment Table (Algorithm 8).
Step 5:
Like the model update in Section 4.3 (Algorithm 5), t servers send the adjusted w to each client. After receiving t gradient shares from t servers, each client sums the shares to obtain new gradients corresponding to their level and then uses these new gradients to update the model.
Algorithm 6 Sending information to servers
Input:  ( m i , 0 , 0 , , m i , 0 , t 1 ) , , ( m i , q 1 , 0 , , m i , q 1 , t 1 ) is the q group of gradient shares of C i , s 0 s q 1 is the identifier of Svr s , l i is the data volume of C i , l v i is the level chosen by C i .
  for  j 0 to q 1  do
       C i sends m i , j , s j to Svr s
  end for
   C i sends l i and l v i to Svr s
Algorithm 7 Generate Gradient Adjustment Table
Input: Level Range Table, model layer number p.
Output: Gradient Adjustment Table T G .
  for each level do
      Generate p random number r 1 , , r p based on the Level Range Table.
  end for
Algorithm 8 Gradient adjustment
Input: Gradient Adjustment Table T G , the w k is the weighted average gradient shares by the Svr k , model layer number p, l v i is the level selected by C i .
  for  j 0 to p do
      The gradient of the j-th layer in w k is adjusted by multiplying with r j of the l v i in T G .
  end for
After the above five steps, each client receives the gradient-updated model of the corresponding level. By ensuring that the model with a larger gradient adjustment is more likely to have lower accuracy, customers who pay a higher amount can enjoy a model with higher accuracy.

6. Evaluation

In this section, we divide the experiment into two parts. The first part is a comparison of our proposed method with FedAvg [2] and Fedshare [8], and experiments using both IID data and non-IID data are conducted to evaluate its effectiveness. In the second part, we present our proposed federated learning scenario. We divide members into seven levels and detail the experimental results of member grading.

6.1. Learning Performance and Computational Cost

6.1.1. Experimental Environment

The following experiments are all performed on Google Colab Pro [27]. We use Python 3.10.12 and PyTorch 2.3.0 [28] to build a CNN model with si layers. We used the MNIST [29] and Fashion-MNIST [30] datasets for our experiments. Throughout the experiments, we simulate the complete federated learning process, including local training, server aggregation, and model updates. In the client’s local training, we set the training for 5 epochs, with a batch size of 32 and learning rate of 0.1. Each complete experiment operation comprises five rounds. In the experiment to evaluate effectiveness, we aimed to compare FedAvg [2], Fedshare [8], and our proposed method. FedAvg [2] is a classic federated learning approach, and Fedshare [8] uses n-out-of-n secret sharing in federated learning for privacy. We chose these two methods for comparison, to demonstrate that our method maintains accuracy while protecting the privacy of federated learning.

6.1.2. Accuracy Comparison for Independent and Identically Distributed Data (IID)

Both the MNIST [29] and Fashion-MNIST [30] datasets contain 60,000 samples for training. In this experiment, we randomly and equally distributed the training data to all clients to ensure IID. We compared 2, 5, 10, and 20 clients. Regarding server numbers, because FedAvg [2] has only one server in its design, we mainly compared Fedshare [8] and our method. We performed 2 experiments with a total number of servers being 5 and 10. Additionally, the method we proposed is t-out-of-n, and we set t to 2 for the experiments.
The experimental results are shown in Figure 6 and Figure 7. We included the accuracy rate from the most basic machine learning model as a baseline for comparison. It can be seen from the four figures that regardless of whether it is the MNIST [29] or Fashion-MNIST [30] dataset, the accuracy decreases slightly as the number of clients increases. However, the number of servers does not affect the accuracy of the model after training. The accuracy of the three federated learning methods differs from that of the basic machine learning model. This may be related to the number of local training iterations and the dataset distribution. Figure 7 shows that there is a slight, but insignificant, difference in accuracy between 5 and 10 servers. In addition, our proposed method has basically the same the accuracy as FedAvg [2]. This shows that our method can protect the gradient without affecting the accuracy of federated learning.

6.1.3. Accuracy Comparison for Non-Independent and Non-Identically Distributed Data (Non-IID)

In the real world, the data held by each customer vary due to differences in environment and habits, resulting in differences in the type or amount of data held by customers. To simulate this real-world situation, we conducted non-IID experiments. Specifically, we simulated non-IID quantity skew, which refers to the variation in the amount of data owned by each client. In the non-IID experiment, we first shuffle the dataset and then randomly assign different amounts of data to each client. Each client uses the assigned data to start local training.
As with the IID experiment, we used two datasets, MNIST [29] and Fashion-MNIST [30], and set the number of servers for FedAvg [2] to one, and the number of servers for Fedshare [8] and our method to five. We also conducted four experiments with 2, 5, 10, and 20 clients for comparison. Figure 8a shows the experimental results for the MNIST [29] dataset, and Figure 8b shows the experimental results for the Fashion-MNIST [30] dataset. The results show that the model has lower accuracy when trained with non-IID data than with IID data, due to the uneven distribution of the former. Additionally, Figure 8a,b clearly demonstrate that because we incorporate the same weighted mechanism as FedAvg [2] and Fedshare [8] in the non-IID experiment, the accuracy of the three methods is very similar.

6.1.4. Computational Cost for IID and Non-IID Data

Figure 9a,b display the execution times of IID data federated learning using MNIST [29] and Fashion-MNIST [30], respectively. These times are mainly calculated from local training on the client and include the times for splitting gradients, sending them to the server, server aggregation, sending back to the client, and client updates. This represents the time for the complete process of federated learning. It can be clearly seen from the two figures that the larger the number of servers and clients, the longer the execution time. FedAvg [2] takes the least amount of time among the three methods because the steps for gradient splitting and distributing to multiple servers are omitted. The execution time of our proposed method is the longest of the three methods, because unlike Fedshare [8], we need to perform secret sharing multiple times based on the number of servers, and not just once for the model parameters of each client. Additionally, we need to select a combination of t servers that meet the requirements, which further increases our execution time, thus making it the slowest of the three methods.
Figure 10a,b display the execution times of non-IID data federated learning using MNIST [29] and Fashion-MNIST [30], respectively. There is little difference in the execution time compared to the IID data experiments (Figure 9). FedAvg has the shortest execution time because the gradient-splitting step is omitted. In contrast, our method requires the longest time due to more frequent gradient-splitting steps and additional server selection steps. In addition, similarly to the IID experiment, the execution time for aggregate updates of the three methods increases as the number of clients increases.

6.2. The Recovery Probability for Server Errors

In the t-out-of-n aggregation method for federated learning that we propose, the selected t servers cannot be chosen at random and need to comply with the rules specified in Section 4.2.2 for reconstructing the original gradient. Therefore, even if several servers are damaged, federated learning can still operate normally within a certain probability. However, if the server is damaged beyond this threshold, gradient reconstruction may fail, causing federated learning to be halted. Here, we consider the situation of selecting t servers from n servers, where n = t 2 . Each server is numbered in base t and identified by a two-digit number, which means that each digit can only be a number from 0 to t 1 . Regardless of whether the first or second digit is used to group servers with the same number, there will be t groups, with t servers for each. For example, using the first digit to group, the group with the number 0 will contain the servers 00 , 01 , , 0 t 1 .
We select t servers. As long as one server is selected from each group, the mechanism can operate normally regardless of whether it is based on the first or second digit. On the other hand, the entire mechanism will fail if all servers in at least one group are damaged based on both the first and second numbers. Under the condition of n = t 2 , let F be the system failure event and b the number of damaged servers. If b < 2 t 1 , it will definitely not affect the operation of federated learning, P ( F ) = 0 . If b t × ( t 1 ) + 1 , the mechanism will definitely fail, P ( F ) = 1 . If 2 t 1 b < t × ( t 1 ) + 1 , there is a possibility that the mechanism cannot operate.
In the experiment, we listed all the combinations from b = 1 to n and assessed each individually according to whether, when including both a first and second digit, they will fail to find t servers with different numbers. Figure 11 shows the probability that the mechanism will fail due to different numbers of damaged servers in the 4-out-of-16 and 5-out-of-25 situation. It can be seen that, when b < 2 t 1 , the mechanism is not affected. When 2 t 1 b < t × ( t 1 ) + 1 , the more servers that are damaged, the higher the probability of failure. After b t × ( t 1 ) + 1 , the mechanism will definitely fail.
Through probability analysis, it can be confirmed that our proposed t-out-of-n additive secret sharing method exhibits a certain level of fault tolerance compared to traditional additive secret sharing, thereby rendering the overall mechanism more stable.

6.3. The Successful Attack Probability for Compromised Servers

As described above, we only need to select a server from each group to reconstruct the gradient. We assume that the attacker attacks d servers at will. Let F be a successful system attack event. When d < t , the attacker cannot restore any gradient information, P ( F ) = 0 . When d ( t 1 ) 2 + 1 , an attacker can definitely restore the gradient, P ( F ) = 1 . When t d < ( t 1 ) 2 + 1 , there are n d ways for an attacker to select d servers from n servers. Based on assumptions that are the same as above, each group has t servers, and we need to confirm whether the d servers selected by the attacker include any group from either the first or second digit. If so, the attacker can reconstruct the original gradient.
In the experiment, we listed all the combinations from d = 1 to n and assessed each individually regarding consistency with the situation where the first or second digit of t servers is different. Figure 12 shows the probability that the attacker successfully reconstructs the gradient by attacking different numbers of servers in 4-out-of-16 and 5-out-of-25. It can be seen that when d < t , it is impossible for the attacker to restore the gradient information. When t d < ( t 1 ) 2 + 1 , as the number of attacked servers increases, the probability of obtaining the complete gradient also increases. When d ( t 1 ) 2 + 1 , the attacker can definitely obtain the complete gradient information.

6.4. Quality of Federated Learning Experiment

In this experiment, we established a membership system, hoping that by adjusting the gradient, members of different levels may receive models of varying accuracy. In the last round of federated learning, the gradient will be multiplied by a random number within a range set based on the client’s level, and then returned to the client. Clients of the highest level can obtain the original trained model, while the lower the level, the greater the gradient adjustment of the model.
This experiment was performed on Google Colab Pro [27]. We used Python 3.10.12 and PyTorch 2.3.0 [28] to build a CNN model with 9 layers, a learning rate of 0.001, and local training for 5 epochs. The MNIST dataset [29] was used in the experiments. We set seven levels and seven clients, with one client corresponding to one level.
We conducted a total of 50 experiments and averaged the results. The results are presented in Table 2 and Figure 13. From the experimental results, we can see that modifying the gradient does indeed have the potential to reduce the model accuracy. Furthermore, based on the percentage drop in Table 2, it can be observed that the greater the modification factor, the larger the decrease in accuracy. In order to affect the model accuracy by modifying the gradient, it must be adjusted at least 20 times before the decline in accuracy becomes significant. It can be clearly seen from the experimental results in Figure 13 that the greater the adjustment of the gradient, the greater the decrease in the average, median, and lower whisker. However, the adjustment multiple should not be too large, as this could lead to an excessive decrease in model accuracy, as seen with the outliers in the (50,60) range in Figure 13. This experiment demonstrates that it is possible to adjust the gradient so that members of different levels can obtain models of varying accuracy, and clients can make different choices according to their needs.

7. Conclusions and Future Works

In this paper, we propose a distributed federated learning mechanism based on additive t-out-of-n secret sharing that ensures the availability and protection of gradients. Because of the properties of additive secret sharing, neither the server nor the adversary can obtain the complete gradient for each customer. This ensures that the customer’s training data cannot be reconstructed. Additionally, our mechanism guarantees a certain probability of maintaining overall operation even in the event of server damage during federated learning operations. We also conducted experiments to compare the performance of our method with that of FedAvg [2] and FedShare [8], and the experimental results demonstrate that our method does not impact the model accuracy. In addition, we proposed using a membership level system that allows customers to choose membership levels according to their own needs and budgets, and customers are provided with different levels of gradient adjustments based on their membership level.
In future work, we hope to develop new methods such that when customers perform secret sharing, the number of groups generated by shares will not change with the number of servers. This would allow reducing the cost and execution time, as only one group of shares is generated.

Author Contributions

Conceptualization, Y.-T.T. and P.-W.C.; methodology, Y.-T.T. and P.-W.C.; software, Y.-T.T.; validation, Y.-T.T. and P.-W.C.; formal analysis, Y.-T.T.; investigation, Y.-T.T. and P.-W.C.; resources, Y.-T.T. and P.-W.C.; data curation, Y.-T.T.; writing—original draft preparation, Y.-T.T.; writing—review and editing, Y.-T.T., P.-W.C. and C.-T.K.; visualization, Y.-T.T.; supervision, P.-W.C. and C.-T.K.; project administration, P.-W.C.; funding acquisition, P.-W.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by National Science and Technology Council, Taiwan, under the Grant NSTC 113-2634-F-004-001-MBK.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SASelective Availability
DLGDeep Leakage from Gradients
SMCSecure Multiparty Computation
GPSGlobal Positioning System

References

  1. McMahan, H.B.; Moore, E.; Ramage, D.; y Arcas, B.A. Federated Learning of Deep Networks using Model Averaging. arXiv 2016, arXiv:1602.05629. [Google Scholar]
  2. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv 2023, arXiv:1602.05629. [Google Scholar]
  3. Zhu, L.; Liu, Z.; Han, S. Deep leakage from gradients. In Proceedings of the Annual Conference on Neural Information Processing Systems 2019, Vancouver, BC, Canada, 8–14 December 2019. [Google Scholar]
  4. Zhao, B.; Mopuri, K.R.; Bilen, H. idlg: Improved deep leakage from gradients. arXiv 2020, arXiv:2001.02610. [Google Scholar]
  5. Wei, W.; Liu, L.; Loper, M.; Chow, K.H.; Gursoy, M.E.; Truex, S.; Wu, Y. A framework for evaluating gradient leakage attacks in federated learning. arXiv 2020, arXiv:2004.10397. [Google Scholar]
  6. Geiping, J.; Bauermeister, H.; Dröge, H.; Moeller, M. Inverting gradients-how easy is it to break privacy in federated learning? Adv. Neural Inf. Process. Syst. 2020, 33, 16937–16947. [Google Scholar]
  7. More, Y.; Ramachandran, P.; Panda, P.; Mondal, A.; Virk, H.; Gupta, D. SCOTCH: An efficient secure computation framework for secure aggregation. arXiv 2022, arXiv:2201.07730. [Google Scholar]
  8. Fazli Khojir, H.; Alhadidi, D.; Rouhani, S.; Mohammed, N. FedShare: Secure aggregation based on additive secret sharing in federated learning. In Proceedings of the 27th International Database Engineered Applications Symposium, Heraklion, Crete, Greece, 5–7 May 2023; pp. 25–33. [Google Scholar]
  9. Han, D.J.; Choi, M.; Park, J.; Moon, J. FedMes: Speeding up federated learning with multiple edge servers. IEEE J. Sel. Areas Commun. 2021, 39, 3870–3885. [Google Scholar] [CrossRef]
  10. Qu, Z.; Li, X.; Xu, J.; Tang, B.; Lu, Z.; Liu, Y. On the convergence of multi-server federated learning with overlapping area. IEEE Trans. Mob. Comput. 2022, 22, 6647–6662. [Google Scholar] [CrossRef]
  11. Dwork, C. Differential privacy. In International Colloquium on Automata, Languages, and Programming; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1–12. [Google Scholar]
  12. Yao, A.C. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), Chicago, IL, USA, 3–5 November 1982; pp. 160–164. [Google Scholar]
  13. Park, J.; Lim, H. Privacy-preserving federated learning using homomorphic encryption. Appl. Sci. 2022, 12, 734. [Google Scholar] [CrossRef]
  14. Fang, H.; Qian, Q. Privacy preserving machine learning with homomorphic encryption and federated learning. Future Internet 2021, 13, 94. [Google Scholar] [CrossRef]
  15. Zhang, C.; Li, S.; Xia, J.; Wang, W.; Yan, F.; Liu, Y. {BatchCrypt}: Efficient homomorphic encryption for {Cross-Silo} federated learning. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX ATC 20), Online, 15–17 July 2020; pp. 493–506. [Google Scholar]
  16. Ma, J.; Naas, S.A.; Sigg, S.; Lyu, X. Privacy-preserving federated learning based on multi-key homomorphic encryption. Int. J. Intell. Syst. 2022, 37, 5880–5901. [Google Scholar] [CrossRef]
  17. Ku, H.; Susilo, W.; Zhang, Y.; Liu, W.; Zhang, M. Privacy-preserving federated learning in medical diagnosis with homomorphic re-encryption. Comput. Stand. Interfaces 2022, 80, 103583. [Google Scholar] [CrossRef]
  18. Chuanxin, Z.; Yi, S.; Degang, W. Federated learning with Gaussian differential privacy. In Proceedings of the 2020 2nd International Conference on Robotics, Intelligent Control and Artificial Intelligence, Shanghai, China, 17–19 October 2020; pp. 296–301. [Google Scholar]
  19. Wei, K.; Li, J.; Ding, M.; Ma, C.; Yang, H.H.; Farokhi, F.; Jin, S.; Quek, T.Q.; Poor, H.V. Federated learning with differential privacy: Algorithms and performance analysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3454–3469. [Google Scholar] [CrossRef]
  20. Triastcyn, A.; Faltings, B. Federated learning with bayesian differential privacy. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 2587–2596. [Google Scholar]
  21. Sun, L.; Qian, J.; Chen, X. LDP-FL: Practical private aggregation in federated learning with local differential privacy. arXiv 2020, arXiv:2007.15789. [Google Scholar]
  22. Duan, J.; Zhou, J.; Li, Y. Privacy-preserving distributed deep learning based on secret sharing. Inf. Sci. 2020, 527, 108–127. [Google Scholar] [CrossRef]
  23. Li, Y.; Zhou, Y.; Jolfaei, A.; Yu, D.; Xu, G.; Zheng, X. Privacy-preserving federated learning framework based on chained secure multiparty computing. IEEE Internet Things J. 2020, 8, 6178–6186. [Google Scholar] [CrossRef]
  24. Xu, R.; Baracaldo, N.; Zhou, Y.; Anwar, A.; Ludwig, H. Hybridalpha: An efficient approach for privacy-preserving federated learning. In Proceedings of the P12th ACM Workshop on Artificial Intelligence and Security, London, UK, 15 November 2019; pp. 13–23. [Google Scholar]
  25. Columbia University, Department of Computer Science. COMS W4261: Introduction to Cryptography. 2019. Available online: https://www.cs.columbia.edu/~tal/4261/F19/secretsharingf19.pdf (accessed on 11 August 2023).
  26. Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613. [Google Scholar] [CrossRef]
  27. Bisong, E.; Bisong, E. Google colaboratory. In Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners; Apress: Berkeley, CA, USA, 2019; pp. 59–64. [Google Scholar]
  28. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. In Proceedings of the Annual Conference on Neural Information Processing Systems 2019, Vancouver, BC, Canada, 8–14 December 2019. [Google Scholar]
  29. Deng, L. The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 2012, 29, 141–142. [Google Scholar] [CrossRef]
  30. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-mnist: A novel image dataset for benchmarking machine learning algorithms. arXiv 2017, arXiv:1708.07747. [Google Scholar]
Figure 1. t-out-of-n federated learning architecture diagram. In this example, t is 2 and n is 4. After training the model locally, the client divides the gradients into 4 shares and distributes them to all servers. Each server aggregates the received shares. The selected 2 servers, the leftmost server and the rightomst server here, then return the aggregated shares to the client, which aggregates the t received shares to update the model.
Figure 1. t-out-of-n federated learning architecture diagram. In this example, t is 2 and n is 4. After training the model locally, the client divides the gradients into 4 shares and distributes them to all servers. Each server aggregates the received shares. The selected 2 servers, the leftmost server and the rightomst server here, then return the aggregated shares to the client, which aggregates the t received shares to update the model.
Applsci 14 08959 g001
Figure 2. C i splits the gradients after local training and sends the gradients and data volume to Svr s .
Figure 2. C i splits the gradients after local training and sends the gradients and data volume to Svr s .
Applsci 14 08959 g002
Figure 3. From n servers, select t servers, and calculate the weighted average of the selected t servers.
Figure 3. From n servers, select t servers, and calculate the weighted average of the selected t servers.
Applsci 14 08959 g003
Figure 4. All clients update the model after receiving w from the t servers.
Figure 4. All clients update the model after receiving w from the t servers.
Applsci 14 08959 g004
Figure 5. Process diagram for providing models of different accuracy to clients of varying levels.
Figure 5. Process diagram for providing models of different accuracy to clients of varying levels.
Applsci 14 08959 g005
Figure 6. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the IID MNIST [29] dataset with 5 and 10 servers and varying numbers of clients.
Figure 6. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the IID MNIST [29] dataset with 5 and 10 servers and varying numbers of clients.
Applsci 14 08959 g006
Figure 7. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the IID Fashion-MNIST [30] dataset with 5 and 10 servers and varying numbers of clients.
Figure 7. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the IID Fashion-MNIST [30] dataset with 5 and 10 servers and varying numbers of clients.
Applsci 14 08959 g007
Figure 8. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the Non-IID MNIST [29] and Non-IID Fashion-MNIST [30] datasets with 5 servers and varying numbers of clients.
Figure 8. Comparison of the accuracy of our proposed method with FedAvg [2] and FedShare [8] using the Non-IID MNIST [29] and Non-IID Fashion-MNIST [30] datasets with 5 servers and varying numbers of clients.
Applsci 14 08959 g008
Figure 9. Comparison of the average execution time using the IID MNIST [29] and IID FashionMNIST [30] datasets for different numbers of clients and servers with FedAvg [2], FedShare [8], and our method.
Figure 9. Comparison of the average execution time using the IID MNIST [29] and IID FashionMNIST [30] datasets for different numbers of clients and servers with FedAvg [2], FedShare [8], and our method.
Applsci 14 08959 g009
Figure 10. Comparison of the average execution time using the Non-IID MNIST [29] and Non-IID FashionMNIST [30] datasets for different numbers of clients and servers with FedAvg [2], FedShare [8], and our method.
Figure 10. Comparison of the average execution time using the Non-IID MNIST [29] and Non-IID FashionMNIST [30] datasets for different numbers of clients and servers with FedAvg [2], FedShare [8], and our method.
Applsci 14 08959 g010
Figure 11. The probability that the mechanism fails due to the number of damaged servers for 4-out-of-16 and 5-out-of-25 situations.
Figure 11. The probability that the mechanism fails due to the number of damaged servers for 4-out-of-16 and 5-out-of-25 situations.
Applsci 14 08959 g011
Figure 12. The probability that an attacker successfully reconstructs the gradient by attacking d servers for the 4-out-of-16 and 5-out-of-25 situations.
Figure 12. The probability that an attacker successfully reconstructs the gradient by attacking d servers for the 4-out-of-16 and 5-out-of-25 situations.
Applsci 14 08959 g012
Figure 13. Box plot of gradient model accuracy at different magnifications (50 times).
Figure 13. Box plot of gradient model accuracy at different magnifications (50 times).
Applsci 14 08959 g013
Table 1. Notations.
Table 1. Notations.
NotationDescription
Svr server
C client
nnumber of servers
tnumber of selected threshold
qnumber of bits converted from server number, q = log t ( n 1 )
znumber of clients
M i the gradient of the i-th client’s model
m i , j , k the k-th share of the j-th secret shared by the i-th client
l i the amount of data from the ith client
Lthe total data amount, L = l i
hthe different bit numbers of the selected server
wthe weighted average obtained by servers
n u m the set of numbers of the selected t servers
Sthe set of the number of all servers
r i the random number for adjusting gradient of model i-th layer
pthe number of layers in the model
l v the membership level
T G the Gradient Adjustment Table generated by the level server
Table 2. Accuracy of models with gradients of different magnifications (50 times).
Table 2. Accuracy of models with gradients of different magnifications (50 times).
Accuracy (%)MaximumMinimumAveragePercentage Drop
Random
Original95.0994.0994.62
(1, 10)94.8594.0494.540.08
(10, 20)94.7794.0794.440.18
(20, 30)94.2689.3292.622.11
(30, 40)94.7781.4591.643.15
(40, 50)94.3562.5289.295.63
(50, 60)94.3451.1183.9711.25
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ting, Y.-T.; Chi, P.-W.; Kuo, C.-T. A Reliable Aggregation Method Based on Threshold Additive Secret Sharing in Federated Learning with Quality of Service (QoS) Support. Appl. Sci. 2024, 14, 8959. https://doi.org/10.3390/app14198959

AMA Style

Ting Y-T, Chi P-W, Kuo C-T. A Reliable Aggregation Method Based on Threshold Additive Secret Sharing in Federated Learning with Quality of Service (QoS) Support. Applied Sciences. 2024; 14(19):8959. https://doi.org/10.3390/app14198959

Chicago/Turabian Style

Ting, Yu-Ting, Po-Wen Chi, and Chien-Ting Kuo. 2024. "A Reliable Aggregation Method Based on Threshold Additive Secret Sharing in Federated Learning with Quality of Service (QoS) Support" Applied Sciences 14, no. 19: 8959. https://doi.org/10.3390/app14198959

APA Style

Ting, Y. -T., Chi, P. -W., & Kuo, C. -T. (2024). A Reliable Aggregation Method Based on Threshold Additive Secret Sharing in Federated Learning with Quality of Service (QoS) Support. Applied Sciences, 14(19), 8959. https://doi.org/10.3390/app14198959

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop