wyxscir / Starred · GitHub

wyxscir

🍒

wangyuxin1025@163.com

9 followers · 45 following

beijing

Lists (4)

efficient

14 repositories

largemodel

65 repositories

papercode

tools

Stars

JiangInsight / Finmaster

Python 2 Updated May 18, 2025

kxfan2002 / SophiaVL-R1

SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward

Python 48 1 Updated Jun 3, 2025

HiThink-Research / BizFinBench

A Business-Driven Real-World Financial Benchmark for Evaluating LLMs

Python 188 5 Updated Jun 5, 2025

QingyangZhang / Label-Free-RLVR

197 3 Updated Jun 4, 2025

jennyzzt / dgm

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Python 1,239 254 Updated Jun 12, 2025

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 834 46 Updated Jun 12, 2025

Paper2Poster / Paper2Poster

Open-source Multi-agent Poster Generation from Papers

Python 1,981 100 Updated Jun 4, 2025

ruixin31 / Spurious_Rewards

Python 278 18 Updated Jun 8, 2025

sunblaze-ucb / Intuitor

Code for the paper: "Learning to Reason without External Rewards"

Python 273 23 Updated Jun 12, 2025

MiniMax-AI / SynLogic

The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 119 5 Updated Jun 3, 2025

EffiVLM-Bench / EffiVLM-Bench

Python 9 Updated Jun 3, 2025

Kelaxon / SSR-Zero

Code for paper "SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation"

5 Updated May 28, 2025

laude-institute / terminal-bench

A benchmark for LLMs on complicated tasks in the terminal

Shell 161 38 Updated Jun 12, 2025

Gen-Verse / MMaDA

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,049 47 Updated Jun 4, 2025

zchuz / SiGIR-MHQA

[ACL 2025 Findings] Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering

Python 2 Updated May 21, 2025

zhaohongxuan / obsidian-weread-plugin

Obsidian Weread Plugin is a plugin to sync Weread (微信读书) highlights and annotations into your Obsidian Vault.

TypeScript 1,453 86 Updated May 6, 2025

RLHFlow / Minimal-RL

Python 202 9 Updated May 14, 2025

QwenLM / ParScale

Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling

Python 381 15 Updated May 17, 2025

LoverLost / EffiVLM-Bench

Python 1 Updated May 19, 2025

XiaomiMiMo / MiMo

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,443 62 Updated Jun 5, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 226 23 Updated Jun 3, 2025

LeapLabTHU / Absolute-Zero-Reasoner

Official Repository of Absolute Zero Reasoner

Python 1,508 254 Updated Jun 2, 2025

Alibaba-NLP / ZeroSearch

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 1,002 91 Updated Jun 12, 2025

yfzhang114 / r1_reward

✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python 143 7 Updated May 9, 2025

RLHFlow / GVM

Python 13 Updated May 7, 2025

ByteDance-Seed / Seed-Coder

Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.

496 34 Updated Jun 6, 2025

ypwang61 / One-Shot-RLVR

Official repository for "Reinforcement Learning for Reasoning in Large Language Models with One Training Example"

Python 282 25 Updated Jun 9, 2025

HITsz-TMG / YiZhao

YiZhao: A 2TB Open Financial Corpus. Data and tools for generating and inspecting YiZhao, a safe, high-quality, open-source bilingual financial corpus (Chinese and English).

Python 26 3 Updated Dec 12, 2024

idea-iitd / graphgen

GraphGen: A Scalable Approach to Domain-agnostic Labeled Graph Generation

C++ 59 16 Updated Jul 6, 2023

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 2,442 155 Updated Jun 12, 2025
