RLite

🚀 Quick start · 🌰 Examples · 🍲 Recipes · 📚 Docs

A lightweight RL framework with PyTorch-like interfaces that integrates seamlessly into your codebase, letting developers focus on algorithms with minimal intrusion.

Features

  • FSDP2 and FSDP support for training.
  • vLLM support for inference.
  • ray support for resource management.
  • Easy to learn and use. Most interfaces mirror PyTorch, with the parallel engine working seamlessly behind the scenes.
  • Recipes that reproduce SOTA results, each in a single self-contained Python script.

Installation

pip install pyrlite
Advanced installation options

We recommend using conda to manage the computation environment.

  1. Create a conda environment:
conda create -n rlite python==3.12
conda activate rlite
  2. Install common dependencies:
# install vllm
pip install vllm accelerate

# flash attention 2 (MAX_JOBS=64 controls parallel build jobs; lower it if you have fewer CPU cores)
MAX_JOBS=64 pip install flash-attn --no-build-isolation

# Install flashinfer for faster inference
pip install flashinfer-python==0.2.2.post1 -i https://flashinfer.ai/whl/cu124/torch2.6
  3. Install rlite:
git clone https://github.com/rlite-project/RLite.git
cd RLite; pip install -e .

Recipes

We use recipes as examples for reproducing SOTA RL methods; see 🍲 Recipes for the featured list.

Programming Model

In RLite, users mainly work with Engines. An Engine is a handler that takes input from the main process, organizes it into tasks, and dispatches them to the workers. An Engine may have multiple Executors, each holding a full set of model weights. Both Engines and Executors reside in the main process. Workers are the units that actually perform the computation, with each Worker corresponding to a GPU. Conversely, a single GPU can be associated with multiple Workers, which share the GPU in a time-multiplexed manner.
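
To make this layering concrete, here is a minimal, self-contained sketch that mimics the structure described above. The class and method names are invented for illustration only and are not RLite's actual API.

# A toy model of the Engine / Executor / Worker layering -- NOT RLite's API.
class Worker:
    """Performs the actual computation; in RLite each Worker maps to a GPU."""

    def __init__(self, gpu_id: int):
        self.gpu_id = gpu_id

    def run(self, shard):
        # Stand-in for real GPU work (e.g. a forward pass or a generation step).
        return [x * 2 for x in shard]


class Executor:
    """Holds a full set of model weights and drives a group of Workers."""

    def __init__(self, workers):
        self.workers = workers

    def execute(self, task):
        # Split the task across workers and gather the results back.
        n = len(self.workers)
        shards = [task[i::n] for i in range(n)]
        return [out for w, s in zip(self.workers, shards) for out in w.run(s)]


class Engine:
    """Lives in the main process: takes input, organizes tasks, sends them to workers."""

    def __init__(self, executors):
        self.executors = executors

    def submit(self, inputs):
        # A real engine would route work across executors; this sketch uses the first one.
        return self.executors[0].execute(inputs)


if __name__ == "__main__":
    engine = Engine([Executor([Worker(0), Worker(1)])])
    print(engine.submit([1, 2, 3, 4]))  # [2, 6, 4, 8] -- order follows the worker sharding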

Key Interfaces

RLite provides minimal interfaces that are

  • easy to learn: most interfaces resemble the behavior of PyTorch.
  • super flexible: interfaces are independent and can be used separately. This allows inference without training (e.g. evaluation tasks) or training without inference (e.g. SFT and DPO).
  • super powerful: combined, the interfaces allow reproducing SOTA RL results.
  • highly extensible: the interfaces allow extensions such as additional training/inference backends, streaming generation for multi-turn use cases, and asynchronous workers that overlap time-consuming operations.

Inference

Inference Example

Train

Train Example

Offload/Reload/Discard Weights

Device Example

Synchronize Weights

Weight Sync

Contributing

Developer's guide.

Write code that you would like to read again.

We use pre-commit and git cz to sanitize commits. Run pre-commit before git cz so that a failed hook does not force you to re-enter the commit message.

pip install pre-commit
# Install pre-commit hooks
pre-commit install
pre-commit install --hook-type commit-msg
# Install this emoji-style tool
sudo npm install -g git-cz --no-audit --verbose --registry=https://registry.npmmirror.com

# Install rlite
pip install -e ".[dev]"
Code Style
  • Maximum line length is 99 characters for code and 79 characters for comments and docstrings.
  • Write unit tests for atomic capabilities so that pytest passes; a minimal sketch follows below.
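
A minimal sketch of such a unit test is shown below; the helper under test is a made-up placeholder, not a function from rlite.

# tests/test_sharding.py -- hypothetical example for illustration only.
def split_into_shards(items, n):
    """Toy "atomic capability" standing in for a real rlite unit."""
    return [items[i::n] for i in range(n)]


def test_split_into_shards_covers_all_items():
    items = list(range(10))
    shards = split_into_shards(items, 3)
    assert sorted(x for shard in shards for x in shard) == items


def test_split_into_shards_handles_empty_input():
    assert split_into_shards([], 3) == [[], [], []]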

Run pre-commit to automatically lint the code:

pre-commit run --all-files
Run unit tests:
# Only run tests
pytest

# Run tests and output test code coverage report
pytest --cov=rlite
