Stars
This is a reproduction of open-r1 that runs GRPO training on Qwen models at 0.5B, 1.5B, 3B, and 7B, and records some interesting observations along the way (a minimal GRPO training sketch is given after this list).
This project aims to share the technical principles behind large language models together with hands-on experience (LLM engineering and LLM application deployment).
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Example models using DeepSpeed
The simplest, fastest repository for training/finetuning medium-sized GPTs.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A fork to add multimodal model training to open-r1
Solve Visual Understanding with Reinforced VLMs
WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge
🚀🚀 Train a 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
Yelp Simulator for WWW'25 AgentSociety Challenge
Retrieval and Retrieval-augmented LLMs
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
A hands-on, step-by-step course on Hugging Face Transformers; the course videos are updated in sync on Bilibili and YouTube.
Fully open reproduction of DeepSeek-R1
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)
A follow-up experiment to the mini_qwen project, exploring the causes of the repetition ("parroting") phenomenon in large language models and how common knowledge injection is during the fine-tuning stage.
A project for training a large language model from scratch, covering pre-training, fine-tuning, and direct preference optimization (DPO); the model has 1B parameters and supports both Chinese and English.
Train a 1B-parameter LLM on 1T tokens from scratch, by an individual.
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
A repository for pre-training from scratch plus SFT of a small-parameter Chinese LLaMA2; a single 24 GB GPU is enough to obtain a chat-llama2 with basic Chinese Q&A ability.
ChatLM-Chinese-0.2B, a 0.2B-parameter Chinese dialogue model, with fully open-sourced code for the entire pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a triple-extraction (information extraction) fine-tuning example.
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into LangChain to load a local knowledge base for retrieval-augmented generation (RAG).
A repository for individuals to experiment with and reproduce the LLM pre-training process.
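The first entry above centers on GRPO training of Qwen models. As a point of reference, here is a minimal sketch of such a run using TRL's GRPOTrainer (the trainer open-r1 builds on); the dataset, the length-based reward, and the checkpoint name are illustrative assumptions, not that repository's actual configuration.

```python
# Minimal GRPO sketch with TRL; dataset, reward, and model name are placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical reward: favor longer completions (a stand-in for the rule-based
# accuracy/format rewards typically used in open-r1-style training).
def reward_len(completions, **kwargs):
    return [float(len(c)) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # toy prompt dataset

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # swap in 1.5B/3B/7B for the larger runs
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="qwen-grpo", num_generations=8),
    train_dataset=dataset,
)
trainer.train()
```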