rxlqn (Weiyuan Li) / Starred · GitHub
🏠
Working from home
  • Shanghai


Code for "Learning to summarize from human feedback"

Python 1,022 148 Updated Sep 5, 2023
Python 279 26 Updated Jul 25, 2024

Code for Medium article How to Evaluate LLM Summarization

Jupyter Notebook 7 1 Updated Jan 20, 2025

A lightweight, powerful framework for multi-agent workflows

Python 9,991 1,342 Updated May 2, 2025

A collection of MCP servers.

46,836 3,449 Updated May 7, 2025

A list of awesome papers and resources of recommender system on large language model (LLM).

1,829 139 Updated Mar 17, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, …

Python 7,415 630 Updated May 8, 2025
Python 81 34 Updated Apr 10, 2025
Python 928 103 Updated Jan 23, 2025

minimal-cost for training 0.5B R1-Zero

Python 716 89 Updated Apr 25, 2025

Python API wrapper for Rocket.Chat

Python 282 93 Updated Apr 22, 2025

AI chat for any model.

TypeScript 31,163 8,803 Updated Aug 3, 2024

The communications platform that puts data protection first.

TypeScript 42,597 11,713 Updated May 8, 2025

IM Chat ChatGPT

Go 14,664 2,577 Updated May 6, 2025

Simple RL training for reasoning

Python 3,533 265 Updated Apr 10, 2025

s1: Simple test-time scaling

Python 6,349 743 Updated Apr 4, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,716 1,482 Updated Apr 24, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,329 154 Updated Mar 20, 2025

Fully open reproduction of DeepSeek-R1

Python 24,319 2,233 Updated May 8, 2025

🔥LeetCode solutions in any programming language, including solutions for LeetCode, 剑指 Offer (Coding Interviews, 2nd edition), and 程序员面试金典 (Cracking the Coding Interview, 6th edition)

Java 33,877 9,088 Updated May 6, 2025

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Python 42,531 3,899 Updated May 7, 2025

This collection aims to present the ‘cherry on the cake’ of recent AI advancements in the realm of LLMs and RL.

1 Updated Jan 19, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,713 373 Updated May 6, 2025

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Jupyter Notebook 1,330 82 Updated Jan 7, 2025

How to run Age of Empires 2 DE on macOS

111 6 Updated Aug 17, 2023

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,601 645 Updated May 7, 2025

[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)

Python 1,363 152 Updated May 7, 2025

This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.

Python 29 3 Updated Sep 22, 2024

Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.

Python 7,411 1,333 Updated May 7, 2025