Exploring further decentralization, communitization, and distribution of the models and data provided in the LEAF repository.

harikuts/leaf-del

Distribution, Decentralization, and Communitization of LEAF

Original LEAF Details

Resources

Datasets

  1. FEMNIST
    • Overview: Image Dataset
    • Details: 62 different classes (10 digits, 26 lowercase, 26 uppercase), images are 28 by 28 pixels (with an option to make them all 128 by 128 pixels), 3500 users
    • Task: Image Classification
  2. Sentiment140
    • Overview: Text Dataset of Tweets
    • Details: 660120 users
    • Task: Sentiment Analysis
  3. Shakespeare
    • Overview: Text Dataset of Shakespeare Dialogues
    • Details: 1129 users (reduced to 660 with our choice of sequence length. See bug.)
    • Task: Next-Character Prediction
  4. Celeba
  5. Synthetic Dataset
    • Overview: We propose a process to generate synthetic, challenging federated datasets. The high-level goal is to create devices whose true models are device-dependent. For a description of the whole generative process, please refer to the paper.
    • Details: The user can customize the number of devices, the number of classes, and the number of dimensions, among others
    • Task: Classification
  6. Reddit
    • Overview: We preprocess the Reddit data released by pushshift.io corresponding to December 2017.
    • Details: 1,660,820 users with a total of 56,587,343 comments.
    • Task: Next-word Prediction
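As a quick sanity check on the FEMNIST figures listed above, the sketch below rebuilds the 62-class label set (10 digits, 26 uppercase, 26 lowercase) and the flattened input size for both image resolutions. The names FEMNIST_CLASSES, class_to_index, and flattened_size are hypothetical, and the digits-then-uppercase-then-lowercase ordering is an assumption; the label indices produced by the actual preprocessing scripts may differ.

```python
import string

# Assumed ordering: digits, then uppercase, then lowercase.
# The real dataset may assign label indices differently.
FEMNIST_CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase

# Default image size from the dataset description (height, width).
IMAGE_SHAPE = (28, 28)


def class_to_index(character):
    """Map a character to its index under the assumed ordering."""
    return FEMNIST_CLASSES.index(character)


def flattened_size(shape=IMAGE_SHAPE):
    """Number of pixel values in one single-channel image of this shape."""
    height, width = shape
    return height * width


print(len(FEMNIST_CLASSES))        # number of classes
print(flattened_size())            # pixels per 28 by 28 image
print(flattened_size((128, 128)))  # pixels per 128 by 128 image
```

This is only an illustration of the class arithmetic, not the loader used by the reference implementations.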

Notes

  • Install the libraries listed in requirements.txt
    • For example, with pip: run pip3 install -r requirements.txt
  • Go to the directory of the respective dataset for instructions on generating data
    • On macOS, check that wget is installed and working
  • The models directory contains instructions on running the baseline reference implementations
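The setup notes above can be checked before generating any data. The following is a minimal sketch, not part of the repository: pip_install_command, missing_tools, and preflight are hypothetical helper names, and the command is built with the current interpreter (python -m pip) rather than pip3, which is a substitution made here so the install targets the same Python that runs the script.

```python
import shutil
import sys
from pathlib import Path


def pip_install_command(requirements="requirements.txt"):
    """Build (but do not run) the pip command for the requirements file."""
    # Using the current interpreter avoids installing into another Python.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def missing_tools(tools=("wget",)):
    """Return the command-line tools from `tools` that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def preflight(repo_root="."):
    """Collect human-readable setup problems without changing anything."""
    problems = []
    if not (Path(repo_root) / "requirements.txt").is_file():
        problems.append("requirements.txt not found in " + str(repo_root))
    for tool in missing_tools():
        problems.append(tool + " is not installed or not on PATH")
    return problems
```

A possible use is to call preflight() from the repository root, print any problems it returns, and only then pass pip_install_command() to subprocess.run. Note that shutil.which only confirms that wget is on PATH; it does not verify that downloads actually work.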
