Weiyuan Li rxlqn

🏠

Working from home

RL, LLM, Agent

9 followers · 18 following

Shanghai

Stars

openai / summarize-from-feedback

Code for "Learning to summarize from human feedback"

Python 1,022 148 Updated Sep 5, 2023

QwenLM / AutoIF

Python 279 26 Updated Jul 25, 2024

thamsuppp / summary-eval-article

Code for Medium article How to Evaluate LLM Summarization

Jupyter Notebook 7 1 Updated Jan 20, 2025

openai / openai-agents-python

A lightweight, powerful framework for multi-agent workflows

Python 9,991 1,342 Updated May 2, 2025

punkpeye / awesome-mcp-servers

A collection of MCP servers.

46,836 3,449 Updated May 7, 2025

WLiK / LLM4Rec-Awesome-Papers

A list of awesome papers and resources on recommender systems with large language models (LLMs).

1,829 139 Updated Mar 17, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, …

Python 7,415 630 Updated May 8, 2025

Azure / gpt-rag-agentic

Python 81 34 Updated Apr 10, 2025

zhentingqi / rStar

Python 928 103 Updated Jan 23, 2025

dhcode-cpp / X-R1

minimal-cost for training 0.5B R1-Zero

Python 716 89 Updated Apr 25, 2025

jadolg / rocketchat_API

Python API wrapper for Rocket.Chat

Python 282 93 Updated Apr 22, 2025

mckaywrigley / chatbot-ui

AI chat for any model.

TypeScript 31,163 8,803 Updated Aug 3, 2024

RocketChat / Rocket.Chat

The communications platform that puts data protection first.

TypeScript 42,597 11,713 Updated May 8, 2025

openimsdk / open-im-server

IM Chat ChatGPT

Go 14,664 2,577 Updated May 6, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,533 265 Updated Apr 10, 2025

simplescaling / s1

s1: Simple test-time scaling

Python 6,349 743 Updated Apr 4, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 11,716 1,482 Updated Apr 24, 2025

Unakar / Logic-RL

Reproduce R1 Zero on Logic Puzzle

Python 2,329 154 Updated Mar 20, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,319 2,233 Updated May 8, 2025

doocs / leetcode

🔥 LeetCode solutions in any programming language | Solutions in multiple programming languages for LeetCode, 剑指 Offer (2nd Edition), and Cracking the Coding Interview (6th Edition)

Java 33,877 9,088 Updated May 6, 2025

unclecode / crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Python 42,531 3,899 Updated May 7, 2025

rxlqn / Awesome-LLM-RL

This collection aims to present the ‘cherry on the cake’ of recent AI advancements in the realm of LLMs and RL.

1 Updated Jan 19, 2025

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,713 373 Updated May 6, 2025

huggingface / evaluation-guidebook

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Jupyter Notebook 1,330 82 Updated Jan 7, 2025

mnapoli / aoe2-de-macos

How to run Age of Empires 2 DE on macOS

111 6 Updated Aug 17, 2023

langchain-ai / memory-template

Python 120 25 Updated Dec 10, 2024

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,601 645 Updated May 7, 2025

opendilab / LightZero

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,363 152 Updated May 7, 2025

RLHFlow / RAFT

This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.

Python 29 3 Updated Sep 22, 2024

TeamWiseFlow / wiseflow

Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.

Python 7,411 1,333 Updated May 7, 2025

Lists (12)

agent

CLIP

generation

IM

leetcode

LLM

Navigation

optimal control

rag

Reinforcement Learning

tools

xiaoice
