xiaohangt (Tim Tang) / Starred · GitHub

Tim Tang xiaohangt

5 followers · 2 following

London, UK

Stars

Zhendong-Wang / Diffusion-Policies-for-Offline-RL

Python 368 43 Updated Apr 29, 2024

uclaml / SPPO

The official implementation of Self-Play Preference Optimization (SPPO)

Python 569 47 Updated Jan 23, 2025

RLHFlow / Online-RLHF

A recipe for online RLHF and online iterative DPO.

Python 521 49 Updated Dec 28, 2024

xiaohangt / RMDO

Official repository of the paper "Regret-Minimizing Double Oracle for Extensive-Form Games", ICML 2023.

Jupyter Notebook 2 4 Updated Jul 11, 2024

afonsosamarques / action-robust-decision-transformer

Jupyter Notebook 1 Updated Jan 12, 2024

jvpoulos / causal-ml

Must-read papers and resources related to causal inference and machine (deep) learning

727 129 Updated Nov 23, 2022

hubbs5 / or-gym

Environments for OR and RL Research

Python 409 96 Updated Oct 12, 2023

rll-research / url_benchmark

Python 346 54 Updated Oct 12, 2022

thu-ml / tianshou

An elegant PyTorch deep reinforcement learning library.

Python 8,611 1,163 Updated Jul 3, 2025

rll / rllab

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Python 2,979 801 Updated Jun 10, 2023

smartcontractkit / full-blockchain-solidity-course-py

Ultimate Solidity, Blockchain, and Smart Contract - Beginner to Expert Full Course | Python Edition

11,038 2,945 Updated Apr 16, 2024

huawei-noah / HEBO

Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab

Jupyter Notebook 2,630 451 Updated Jun 25, 2025

ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 37,848 6,555 Updated Jul 5, 2025

ikostrikov / pytorch-trpo

PyTorch implementation of Trust Region Policy Optimization

Python 441 90 Updated Sep 13, 2018

CyC2018 / CS-Notes

📚 Essential fundamentals for technical interviews: Leetcode, computer operating systems, computer networks, system design

180,813 51,169 Updated Aug 21, 2024

dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Jupyter Notebook 21,404 6,139 Updated Jul 13, 2023
