Stars
The official implementation of Self-Play Preference Optimization (SPPO)
A recipe for online RLHF and online iterative DPO.
Official repository of the paper "Regret-Minimizing Double Oracle for Extensive-Form Games", ICML 2023.
Must-read papers and resources related to causal inference and machine (deep) learning
An elegant PyTorch deep reinforcement learning library.
rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.
Ultimate Solidity, Blockchain, and Smart Contract - Beginner to Expert Full Course | Python Edition
Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
PyTorch implementation of Trust Region Policy Optimization
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton's book and David Silver's course.