GitHub - awesome-software/simpleRL-reason: This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Simple Reinforcement Learning for Reasoning

News

Introduction

Quick Start

Citation

Acknowledgement

Star History
