xiaohangt (Tim Tang) / Starred · GitHub

The official implementation of Self-Play Preference Optimization (SPPO)

Python 569 47 Updated Jan 23, 2025

A recipe for online RLHF and online iterative DPO.

Python 521 49 Updated Dec 28, 2024

Official repository of the paper "Regret-Minimizing Double Oracle for Extensive-Form Games", ICML 2023.

Jupyter Notebook 2 4 Updated Jul 11, 2024

Must-read papers and resources related to causal inference and machine (deep) learning

727 129 Updated Nov 23, 2022

Environments for OR and RL Research

Python 409 96 Updated Oct 12, 2023
Python 346 54 Updated Oct 12, 2022

An elegant PyTorch deep reinforcement learning library.

Python 8,611 1,163 Updated Jul 3, 2025

rllab is a framework for developing and evaluating reinforcement learning algorithms, fully compatible with OpenAI Gym.

Python 2,979 801 Updated Jun 10, 2023

Ultimate Solidity, Blockchain, and Smart Contract - Beginner to Expert Full Course | Python Edition

11,038 2,945 Updated Apr 16, 2024

Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab

Jupyter Notebook 2,630 451 Updated Jun 25, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 37,848 6,555 Updated Jul 5, 2025

PyTorch implementation of Trust Region Policy Optimization

Python 441 90 Updated Sep 13, 2018

📚 Essential fundamentals for technical interviews: Leetcode, computer operating systems, computer networks, system design

180,813 51,169 Updated Aug 21, 2024

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Jupyter Notebook 21,404 6,139 Updated Jul 13, 2023