The CS277 Project

In this repository we aim to benchmark the hindrance that Fully Homomorphic Encryption (FHE) introduces and its impact on efficiency.

Introduction

Our project simulates a Federated Learning (FL) model with the topology shown below.

Figure: FL model topology.

There are three separate tests that we run:

  1. Base Case - This test transfers information only in plaintext under our FL model.
  2. Base Case + FHE - This test transfers information under FHE and performs computation only at the aggregator node of our FL model.
  3. Base Case + FHE + In-Network Processing - This test attempts to improve the efficiency of the process by introducing in-network processing, which offloads some of the computation to the middle switches in our FL model.

We implement our FL model using three major libraries:

  1. Flask - A Python web framework that enables our servers to listen for and transfer data. We define routes to simulate an FL environment.
  2. TenSEAL - A library that enables homomorphic encryption operations on tensors. We leverage it to work with ML applications easily.
  3. PyTorch - A Python framework that lets us implement ML models with little effort.

Metrics

The motivation for this project is to understand how much applications suffer when FHE is used to protect users' privacy, and how feasible that tradeoff is in practice. To that end, there are several metrics that we investigate.

| Metric Name | Base | Base + FHE | Base + FHE + INP |
| --- | --- | --- | --- |
| Peak Memory (GB) | 10 | 20 | 15 |
| Process Time (s) | 5 | 8 | 12 |
| Max File Size (GB) | 30 | 25 | 40 |
| Time Saved (s) | 30 | 25 | 40 |
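
As an illustration only (not necessarily the instrumentation used in this repo), process time, peak memory, and file size can be collected with Python's standard library; the weights.json name below is hypothetical:

import os
import time
import tracemalloc

tracemalloc.start()
start = time.perf_counter()

# ... run one training / aggregation step here ...

elapsed = time.perf_counter() - start      # Process Time (s)
_, peak = tracemalloc.get_traced_memory()  # peak memory allocated through Python (not full process RSS)
tracemalloc.stop()

size = os.path.getsize("weights.json")     # size of a transferred file (hypothetical name)
print(f"time={elapsed:.2f}s peak={peak / 1e9:.3f}GB file={size / 1e9:.3f}GB")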

Running FL Simulation

In this section we describe the requirements and setup needed to run our FL model simulation.

Installation

Clone this GitHub repository

git clone https://github.com/edzhangsy/CS277.git

Change Directory into the folder

cd CS277

Make the setup file executable and run it

chmod +x setup.sh
./setup.sh

Flask Setup

We combine the code for all three roles (client, switch, and aggregator) in one repository.

To run the current setup, begin by starting the switch and client nodes

python main.py

Run the aggregator node

python main.py agg

Then open the localhost web page and change the URL route to /train.

Note: Remember to start the servers in the background so they are not killed when you disconnect your SSH session.
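
One simple way to do this (tmux or screen also work; the log file name is just an example):

nohup python main.py > node.log 2>&1 &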

The aggregator first reads the config.json file. It then calls the config interface of each of the other machines and sends them their configs. The configs for the other clients are stored under the others dictionary, where the key is the IP address and the value is that machine's config; each config is sent to the corresponding client. After receiving its config (a JSON object from the aggregator), each of the other servers registers its blueprint dynamically based on the type field in the config. Check the config.json file and modify the configs if needed.

The configs should be self-explanatory.

When the type is client, the config also contains client_number and index, which indicate how many clients are used in this experiment and the current client's index. This is useful for dividing the training and testing data sets. For example, a client with index 0 among 4 clients slices the data set into 4 slices and operates on slice 0.
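
For illustration, a config.json might look roughly like this (the field names others, type, client_number, and index come from the description above; the addresses and the number of entries are hypothetical):

{
    "others": {
        "10.10.1.2": { "type": "client", "client_number": 2, "index": 0 },
        "10.10.1.3": { "type": "client", "client_number": 2, "index": 1 },
        "10.10.1.5": { "type": "switch" }
    }
}

The index-based data partitioning described above can then be as simple as this sketch:

def client_slice(dataset, client_number, index):
    # Split the data set into client_number contiguous slices and
    # return the slice belonging to this client's index.
    slice_size = len(dataset) // client_number
    return dataset[index * slice_size : (index + 1) * slice_size]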

When you want to start another experiment, just edit the config.json and restart the aggregator.

Development

Because different types of servers have their own unique interfaces, we separate them into different blueprints (see the Flask documentation for what a blueprint is). If you are developing the client, just edit client.py and add interfaces there. The config received from the aggregator is stored in the config variable in client.py or switch.py, so if you are developing the client, the received config is accessible through that local config variable. If you want to send some data to the switch, read the config to find which address to send to, then send to the interface at that address, for example http://10.10.1.5:5000/s/receive. Then check the status code: 200 means the request was successful (see the Flask documentation for how to check status codes). You should also write logs into the local log directory; a dictionary is enough to store the logs.
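
As a sketch of sending data to a switch and checking the status code (assuming the requests library is used for the transfer; the payload shape here is made up):

import requests

# Address of the switch interface, taken from the received config (example value)
switch_url = "http://10.10.1.5:5000/s/receive"

response = requests.post(switch_url, json={"weights": [0.1, 0.2, 0.3]})
if response.status_code == 200:
    print("transfer successful")
else:
    print("transfer failed with status", response.status_code)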

For example, the log dictionary can look like this.

{
    "iteration": [
        {
            "start_time": "timestamp",
            "end_time": "timestamp",
            "byte_received": 50,
            "byte_send": 100
        },
        {
            "start_time": "timestamp",
            "end_time": "timestamp",
            "byte_received": 50,
            "byte_send": 100
        },
        {
            "start_time": "timestamp",
            "end_time": "timestamp",
            "byte_received": 50,
            "byte_send": 100
        }
    ]
}

There are 15 machines on CloudLab, all connected physically through one switch. Addresses beginning with 10.10 are the local addresses: node0 is 10.10.1.1, node1 is 10.10.1.2, and so on.

For convenience, we use node14 (10.10.1.15) as the aggregator.

Remember not to add unnecessary files when you commit.

Remember to start all the clients first, then start the aggregator.

You may need to open port 5000 using iptables.
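
For example, a typical rule allowing incoming TCP traffic on port 5000 looks like this (adjust to your own firewall configuration):

sudo iptables -A INPUT -p tcp --dport 5000 -j ACCEPT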

Microsoft SEAL

SEAL is an open source homomorphic encryption library developed by the Cryptography and Privacy Research Group at Microsoft.

For more information, visit the Microsoft SEAL GitHub repository.

PySEAL

Initially, the PySEAL library was chosen for its ease of use, as it is directly compatible with our setup written in Python.

Note: the Microsoft SEAL library does not work with tensors out of the box.

Since PySEAL is simply a Python wrapper around the Microsoft SEAL library, further modification is needed if we want to use it on tensors.

Below are the directions for running PySEAL; for more information, visit the PySEAL GitHub repository.

Installation

The PySEAL library must be compiled first. After compilation you will see a seal.*.so file; copy it into the root directory of this repo, and you should then be able to use it with import seal.

You can run seal.sh to set this up.

Examples of how to use SEAL are included in 5_ckks_basics.py.

Because seal-python uses pybind to bind the original C++ library, we are working with Python objects that wrap C++ objects. The dir() function is useful for inspecting which methods are available. For example, after generating the secret key, dir(secret_key) reveals methods such as save, load, and to_string.
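
A rough sketch of generating a secret key and inspecting it (class and attribute names follow the style of 5_ckks_basics.py and may differ between seal-python versions; the parameter values are illustrative):

from seal import *

# Set up CKKS parameters and generate a secret key, then inspect it with dir()
poly_modulus_degree = 8192
parms = EncryptionParameters(scheme_type.ckks)
parms.set_poly_modulus_degree(poly_modulus_degree)
parms.set_coeff_modulus(CoeffModulus.Create(poly_modulus_degree, [60, 40, 40, 60]))
context = SEALContext(parms)
keygen = KeyGenerator(context)
secret_key = keygen.secret_key()

# Lists the wrapped C++ methods, e.g. save, load, to_string
print([m for m in dir(secret_key) if not m.startswith("_")])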

TenSEAL

TenSEAL is a library built on top of the Microsoft SEAL library.

It introduces extra features, such as dot products and tensor operations, that make it easy for machine learning applications to invoke FHE.

Examples of how to use TenSEAL are included in tenseal_ckks.py.
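
A minimal illustration of TenSEAL's CKKS vectors (this is not the repo's tenseal_ckks.py; the parameter values are illustrative):

import tenseal as ts

# Create a CKKS context; parameters and scale are illustrative
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

# Encrypt two plain vectors
v1 = ts.ckks_vector(context, [1.0, 2.0, 3.0])
v2 = ts.ckks_vector(context, [4.0, 5.0, 6.0])

# Homomorphic element-wise addition and dot product on ciphertexts
enc_sum = v1 + v2
enc_dot = v1.dot(v2)

print(enc_sum.decrypt())  # approximately [5.0, 7.0, 9.0]
print(enc_dot.decrypt())  # approximately [32.0]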

For more information, visit the TenSEAL GitHub repository.

ML Model Running on PyTorch

For our ML model, we took an existing PyTorch implementation of a basic 2-layer neural network trained on the MNIST dataset.

We modified the mnist.py file so that it writes the weight and bias vectors into a JSON file, allowing us to then call TenSEAL and encrypt the training tensors directly with FHE.

When we complete our Federated Learning model, the client will call replace_weights_mnist.py, which loads the averaged weights returned by the Aggregator and resumes training.
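
A minimal sketch of that export-then-encrypt step (this is not the repo's mnist.py; the layer sizes, file name, and CKKS parameters are illustrative):

import json
import torch.nn as nn
import tenseal as ts

# A small stand-in for the 2-layer network
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Flatten each weight/bias tensor and write them to a JSON file
params = {name: t.detach().flatten().tolist() for name, t in model.state_dict().items()}
with open("weights.json", "w") as f:
    json.dump(params, f)

# Encrypt each parameter vector with TenSEAL's CKKS scheme
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
encrypted_params = {name: ts.ckks_vector(context, values) for name, values in params.items()}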

Scenarios

In this section we describe the simulation logic for each scenario in more detail.

Base Case

For the Base Case, the Client Nodes begin by training their ML models locally. After some training iterations, each client sends its parameters up to the Aggregator Node. In this scenario, our Switch Nodes act as dumb switches that simply forward the files as they come up to the Aggregator Node.

Once the Aggregator Node receives all the necessary files, it aggregates and averages them, then sends the new set of files back to all Client Nodes.

Finally, once the Client Nodes receive the files back from the Aggregator Node, they update the values in their ML models and continue training through a warm start.

Base Case with FHE

For the Base Case with FHE, the Client Nodes begin by training their ML models locally. After completing its training, each client encrypts its parameters and sends them as ciphertext up to the Aggregator Node. In this scenario, our Switch Nodes act as dumb switches that simply forward the files as they come up to the Aggregator Node.

Once the Aggregator Node receives all the necessary files, it aggregates and averages them, then decrypts the ciphertext back to plaintext before sending the new set of files back to all Client Nodes.
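
A sketch of what that encrypted aggregation could look like with TenSEAL (CKKS ciphertexts support addition and scaling by a plain constant; this is an illustration, not the repo's aggregator code):

import tenseal as ts

context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40

# Stand-ins for encrypted parameter vectors received from two clients
client_updates = [ts.ckks_vector(context, [0.1, 0.2, 0.3]),
                  ts.ckks_vector(context, [0.3, 0.2, 0.1])]

# Homomorphically sum the ciphertexts, then scale by 1/n to average
encrypted_avg = client_updates[0]
for update in client_updates[1:]:
    encrypted_avg = encrypted_avg + update
encrypted_avg = encrypted_avg * (1 / len(client_updates))

print(encrypted_avg.decrypt())  # approximately [0.2, 0.2, 0.2]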

Finally, once the Client Nodes receive the files back from the Aggregator Node, they update the values in their ML models and continue training through a warm start.

Base Case with FHE and In-Network Processing

For the Base Case with FHE and In-Network Processing, the Client Nodes begin by training their ML models locally. After completing its training, each client encrypts its parameters and sends them as ciphertext up to the Aggregator Node. In this scenario, our Switch Nodes help decrease the load on the aggregator by performing local aggregation, combining two files into one before forwarding the files up to the Aggregator Node.

Once the Aggregator Node receives all the necessary files, it aggregates and averages the remaining files, then decrypts the ciphertext back to plaintext before sending the new set of files back to all Client Nodes.

Finally, once the Client Nodes receive the files back from the Aggregator Node, they update the values in their ML models and continue training through a warm start.

Footnotes

  1. FL: Federated Learning

  2. FHE: Fully Homomorphic Encryption

  3. INP: In-Network Processing
