Stars
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
A Business-Driven Real-World Financial Benchmark for Evaluating LLMs
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Open-source Multi-agent Poster Generation from Papers
Code for the paper: "Learning to Reason without External Rewards"
The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Code for paper "SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation"
A benchmark for LLMs on complicated tasks in the terminal
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
[ACL 2025 Findings] Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering
Obsidian Weread Plugin is a plugin to sync Weread (微信读书) highlights and annotations into your Obsidian Vault.
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining
Official Repository of "Learning to Reason under Off-Policy Guidance"
Official Repository of Absolute Zero Reasoner
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.
Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”
YiZhao: A 2TB Open Financial Corpus. Data and tools for generating and inspecting YiZhao, a safe, high-quality, open-source bilingual financial corpus (Chinese and English).
GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation
My learning notes and code for ML systems.