This project is developed with the following goals in mind:
- Implement a simple Raft algorithm in Python with an abstract networking/communication and persisted storage layer
- Provide tools to test and trace the reliability of the implementation, including a network proxy to simulate network partition, packet drops/delays etc.
This repos is tested with Python 3.10 under WSL Ubuntu:
python -m venv venv
source ./venv/bin/activate
python -m pip install -v --upgrade pip
pip install -v -r requirements.txt
All options are configured in config.json
.
To run 3 servers named A
, B
, C
, and a network proxy:
Run each command in a separate terminal:
python -m raft.main ./config.json A
python -m raft.main ./config.json B
python -m raft.main ./config.json C
python -m raft.main ./config.json Proxy
Persisted databases will be written to the db/
folder.
Logs will be written to logs/
folder, and you can consolidate all log files into one by:
python -m raft.monitor
This can help you understand the timeline for a particular issue across different servers.
You can run tests against the servers with:
python -m raft.client
We have implemented 3 generic correctness tests:
- Read-after-Write consistency test
- Write to one node
- Read immediately another node, make sure it returns what we just wrote
- Fault tolerant linearizable consistency test:
- Isolate one node out of the network
- Write to another node
- Read from a third node, make sure it returns what we wrote
- Resume network
- Read immediately the previously isolated node, make sure it returns what we wrote
- Eventual consistency test:
- Isolate one node out of the network
- Write to another node
- Poll periodically and make sure that eventually all nodes return the data we wrote
, and 2 performance tests:
- Write performance test: write 100 unique values sequentially, each to a randomly selected node
- Read performance test: read 100 times sequentially, each to a random node
Currently these tests run against both the "Attiya, Bar-Noy, Dolev" (ABD) quorum get/set algorithm (raft/routers/abd.py
), and the Raft algorithm (raft/routers/raft.py
).
Both algorithms provide fault-tolerant linearizable consistency, so it will pass all the tests given that we don't try to read from the faulty node(s).
We also have a simple test for the Raft state machine, where we append fibonacci numbers sequentially only after validating (at the Raft server side) the previous state is unchanged, i.e. atomic compare and swap.
Video Lectures:
- Distributed Systems, Cambridge 2122 (Martin Kleppmann): Youtube
- Distributed Systems, MIT 6.824 (Robert Morris): Youtube
Papers: