8000 GitHub - ToniXWD/toni-lsm: A KV storage engine based on LSM Tree, supporting Redis RESP
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

ToniXWD/toni-lsm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

86 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation


C++20MIT licenseplatform Linux x86_64 Version Badge

Toni-LSM is a educational project to implement a simple kv database from scratch, using lsm-tree as the storage engine. The project manager uses xmake. The project is inspired by mini-lsm, tinykv and leveldb. The project is partly compatible with the Redis Resp protocol, so it can be used as a redis backend and relpace redis-server(Just for fun ๐ŸŽฎ).

The developing posts can be found in my blog or zhihu. The is also a recorded video Course.

๐Ÿ“š New: Lab Course Released!

We now offer a complete hands-on step-by-step Lab course designed for learners interested in building LSM-based storage engines from scratch.

Project production is not easy, please click star โญ to support us โค๏ธ

๐Ÿ‘‰ Toni-LSM Lab Course

You can also join the QQ group ๐Ÿ’ฌ for discussion:

๐Ÿ“ฆ Build Configuration

The project uses xmake as the build system. Below is the xmake configuration for building the project and running tests:

  1. Compile the project
xmake
  1. Run the example program or test
xmake run example
xmake run test_lsm
  1. Install the shared library
xmake install --root lsm_shared

๐Ÿ› ๏ธ Usage

Use as a library

Here is a simple example demonstrating how to use the LSM Tree for basic key-value operations:

#include "../include/lsm/engine.h"
#include "../include/lsm/level_iterator.h"
#include <iostream>
#include <string>

using namespace ::toni_lsm;

int main() {
  // create lsm instance, data_dir is the directory to store data
  LSM lsm("example_data");

  // put data
  lsm.put("key1", "value1");
  lsm.put("key2", "value2");
  lsm.put("key3", "value3");

  // Query data
  auto value1 = lsm.get("key1");
  if (value1.has_value()) {
    std::cout << "key1: " << value1.value() << std::endl;
  } else {
    std::cout << "key1 not found" << std::endl;
  }

  // Update data
  lsm.put("key1", "new_value1");
  auto new_value1 = lsm.get("key1");
  if (new_value1.has_value()) {
    std::cout << "key1: " << new_value1.value() << std::endl;
  } else {
    std::cout << "key1 not found" << std::endl;
  }

  // delete data
  lsm.remove("key2");
  auto value2 = lsm.get("key2");
  if (value2.has_value()) {
    std::cout << "key2: " << value2.value() << std::endl;
  } else {
    std::cout << "key2 not found" << std::endl;
  }

  // iterator
  std::cout << "All key-value pairs:" << std::endl;
  // begin(id): id means transaction id, 0 means disable mvcc
  for (auto it = lsm.begin(0); it != lsm.end(); ++it) {
    std::cout << it->first << ": " << it->second << std::endl;
  }

  // transaction
  auto tranc_hanlder = lsm.begin_tran(IsolationLevel::REPEATABLE_READ);
  tranc_hanlder->put("xxx", "yyy");
  tranc_hanlder->put("yyy", "xxx");
  tranc_hanlder->commit();

  auto res = lsm.get("xxx");
  std::cout << "xxx: " << res.value() << std::endl;

  lsm.clear();

  return 0;
}

Use to replace redis-server

โš ๏ธ Now the project only partly compatible with the Redis Resp protocol, you can check TODO for the supported Redis commands.

xmake run server

Then you can use redis-cli to connect to the server:

redis-example

The project is under development, and the current version is not stable.

๐Ÿ“ˆ Benchmark

We use the official tool redis-benchmark to test the performance of the wrapper redis server. The QPS of commonly used commands are relatively high, considering its IO between memroy and disk.

๐Ÿ” The testing environment is: Win11 WSL Ubuntu 22.04, 32GB 6000 RAM, Intel 12600K.

(base) โžœ  ~ redis-benchmark -h 127.0.0.1 -p 6379 -c 100 -n 100000 -q -t SET,GET,INCR,SADD,HSET,ZADD
WARNING: Could not fetch server CONFIG
SET: 142653.36 requests per second, p50=0.527 msec
GET: 134589.50 requests per second, p50=0.503 msec
INCR: 132802.12 requests per second, p50=0.503 msec
SADD: 131233.59 requests per second, p50=0.519 msec
HSET: 123456.79 requests per second, p50=0.583 msec
ZADD: 126422.25 requests per second, p50=0.615 msec

๐Ÿ” Besides, the QPS of lpush/rpush is so slow that its design needs to be optimized.

The performance of the wrapper redis server is not very good, but it is still fast enough for most use cases. Howerver, the redis server is built based on the LSM Tree KV engine, so it consists of some redundancy locks. If you use the LSM Tree KV engine directly, you can get much better performance. The reason why we use redis-benchmark to test the compatible redis api but not the actual KV engine API is that writing a sophisticated testing tool costs much time so we choose to use the existed redis-benchmark.

โœ… Features && TODO

  • SkipList
    • get/put/remove
    • iterator
    • Range Query
  • MemTable
    • Iterator
    • Range Query
    • flush to sst
  • SST
    • Encode/Decode
    • Iterator
    • Query
    • Range Query
    • Compact
  • Wal
    • Sync Wal
    • Async Wal
    • Recover
  • Transaction
    • MVCC
    • Isolation Level
      • Read Uncommitted
      • Read Committed
      • Repeatable Read (Default)
      • Serializable
  • Config
    • Toml Config
  • Redis
    • Fundamental KV Operations
      • set/get
      • ttl/expire
    • Hash Operations
      • hset/hget/hdel
      • hkeys
    • List Operations
      • lpush/rpush/lpop/rpop
      • llen
      • lrange
    • ZSet
      • ZADD/ZREM/ZINCRBY
      • ZCARD
      • ZRANGE
      • ZSCORE
      • ZRANK
    • Set
      • SADD/SREM
      • SMEMBERS/SISMEMBER
      • SCARD
    • IO Operations
      • FLUSHALL
      • SAVE
    • SDK
      • Python

๐Ÿ” Only commonly used redis commands are supported. The other implementations can refer to src/redis_wrapper/redis_wrapper.cpp. If you need more commands, please submit a pull request.

Redis Resp

Currently, the project only supports a subset of the Redis Resp protocol. The supported commands are listed above. You can add more commands by following the existing implementations in src/redis_wrapper/redis_wrapper.cpp.

๐Ÿงช SDK Development

If you want to develop a new SDK for the project, please refer to the existing SDKs in src/sdk/. Curently, we only have a Python SDK. You can use build the python sdk as following:

cd sdk
./build_python_sdk.sh

Then you can use the SDK in your Python code:

import tonilsm

db = tonilsm.LSM("test2_db")
db.put(b"tomxx", b"catxx")
db.get("tomxx")

t = db.begin_tran()
t.get('tomxx')
t.put('tomxx','1')
t.get('tomxx')
t.commit()

db.get("tomxx")

We welcome contributions for developing SDKs in other programming languages.

๐Ÿค Contribute

If you want to contribute to the project, please check the issues. Also, you can concat me by email๐Ÿ“ง

Lab Construction

I have a plan to make this project a step-to-step Lab like CMU15445 or MIT 6.824. If you are interested in this project, please feel free to contact me by email๐Ÿ“ง

Besides, the branch lab-dev is a development branch for the lab construction.

๐Ÿ“œ License

This project is licensed under the MIT License.

0