Caution
This library is still experimental and under development. Using it will not result in a good user experience: it is not well documented, it is buggy, its interface is not clearly defined, and its most interesting features live in feature branches. We do not recommend using it at this time. If you are an RL developer and want to collaborate, feel free to contact us.
The implementation of this project follows these principles (illustrated by the sketch after this list):
- Algorithms are functions!
- Algorithms are implemented in single files.
- Policies and value functions are data containers.
- Our environment interface is Gymnasium.
- We use JAX for everything.
- We use Chex to write reliable code.
- For optimization algorithms we use Optax.
- For probability distributions we use Distrax.
- For all neural networks we use Flax NNX.
- To save checkpoints we use Orbax.
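For example, the functional style means that a policy is just a data container (here, a pytree of parameters) and a training step is a pure function that maps parameters to updated parameters. The following is a minimal, hypothetical sketch using JAX and Optax, not the actual rl-blox API:

import jax
import jax.numpy as jnp
import optax

# A "policy" as a pure data container: a pytree of parameters.
params = {"w": jnp.zeros((3, 1)), "b": jnp.zeros(1)}

def policy(params, obs):
    return jnp.tanh(obs @ params["w"] + params["b"])

def train_step(params, opt_state, tx, obs, target_act):
    # The "algorithm" is a pure function: parameters in, updated parameters out.
    def loss_fn(p):
        return jnp.mean((policy(p, obs) - target_act) ** 2)
    loss, grads = jax.value_and_grad(loss_fn)(params)
    updates, opt_state = tx.update(grads, opt_state)
    return optax.apply_updates(params, updates), opt_state, loss

tx = optax.adam(1e-3)
opt_state = tx.init(params)
params, opt_state, loss = train_step(
    params, opt_state, tx, jnp.zeros((8, 3)), jnp.zeros((8, 1))
)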
git clone git@github.com:mlaux1/rl-blox.git
After cloning the repository, it is recommended to install the library in editable mode.
pip install -e .
To be able to run the provided examples, use pip install -e '.[examples]'.
To install development dependencies, please use pip install -e '.[dev]'.
You can install all optional dependencies using pip install -e '.[all]'.
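To verify the installation, you can check that the top-level package (rl_blox, as imported in the example below) is importable:

python -c "import rl_blox"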
RL-BLOX relies on Gymnasium's environment interface. The following example trains an agent with the SAC algorithm.
import gymnasium as gym

from rl_blox.algorithms.model_free.sac import (
    create_sac_state,
    train_sac,
)

env_name = "Pendulum-v1"
env = gym.make(env_name)
seed = 1
verbose = 1

# Record episode statistics to monitor training progress.
env = gym.wrappers.RecordEpisodeStatistics(env)

# Initialize the policy and Q networks together with their optimizers.
sac_state = create_sac_state(
    env,
    policy_hidden_nodes=[128, 128],
    policy_learning_rate=3e-4,
    q_hidden_nodes=[512, 512],
    q_learning_rate=1e-3,
    seed=seed,
)

# Train the policy and the two Q networks.
sac_result = train_sac(
    env,
    sac_state.policy,
    sac_state.policy_optimizer,
    sac_state.q1,
    sac_state.q1_optimizer,
    sac_state.q2,
    sac_state.q2_optimizer,
    total_timesteps=11_000,
    buffer_size=11_000,
    gamma=0.99,
    learning_starts=5_000,
    verbose=verbose,
)
env.close()

# Unpack the trained policy and Q networks from the result.
policy, _, q1, _, _, q2, _, _, _ = sac_result
# Do something with the trained policy...
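As a hedged illustration of what using the trained policy might look like, the following rollout loop assumes the policy object can be called directly on an observation to produce an action; the actual rl-blox policy interface may differ (for example, it might return an action distribution):

import numpy as np

eval_env = gym.make(env_name, render_mode="human")
obs, _ = eval_env.reset(seed=seed)
done = False
while not done:
    # Assumption: the policy maps an observation to an action.
    action = np.asarray(policy(obs))
    obs, reward, terminated, truncated, _ = eval_env.step(action)
    done = terminated or truncated
eval_env.close()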
You can build the Sphinx documentation with
pip install -e '.[doc]'
cd doc
make html
The HTML documentation will be available under doc/build/html/index.html.
If you wish to report bugs, please use the issue tracker. If you would like to contribute to RL-BLOX, just open an issue or a pull request. The target branch for pull requests is the development branch, which is merged into master for new releases. If you have questions about the software, please ask them in the discussion section.
The recommended workflow to add a new feature, add documentation, or fix a bug is the following:
- Push your changes to a branch (e.g. feature/x, doc/y, or fix/z) of your fork of the RL-BLOX repository.
- Open a pull request to the main branch.
Pushing directly to the main branch is not allowed.
Run the tests with
pip install -e '.[dev]'
pytest
Semantic versioning must be used: the major version number is incremented when the API changes in a backwards-incompatible way, the minor version when new functionality is added in a backwards-compatible manner, and the patch version for bug fixes, documentation, etc. For example, starting from version 1.2.3, a breaking API change leads to 2.0.0, a backwards-compatible feature to 1.3.0, and a bug fix to 1.2.4.
This library is currently developed at the Robotics Group of the University of Bremen together with the Robotics Innovation Center of the German Research Center for Artificial Intelligence (DFKI) in Bremen.