qiufengqijun / Starred · GitHub

A reproduction of open-r1 that applies GRPO training to Qwen models at 0.5B, 1.5B, 3B, and 7B parameters and reports some interesting observations.

Python 28 Updated Apr 13, 2025

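This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and deploying LLM applications).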

HTML 18,163 2,130 Updated May 27, 2025

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 145,021 29,177 Updated May 31, 2025

Example models using DeepSpeed

Python 6,511 1,092 Updated May 23, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 41,561 6,906 Updated Dec 9, 2024

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,512 178 Updated May 30, 2025

A fork to add multimodal model training to open-r1

Python 1,280 61 Updated Feb 8, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,025 308 Updated May 11, 2025

WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge

Python 123 15 Updated Nov 11, 2024

Reproduce R1 Zero on Logic Puzzle

Python 2,347 155 Updated Mar 20, 2025

🚀🚀 [LLM] Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 21,465 2,529 Updated Apr 30, 2025

Yelp Simulator for WWW'25 AgentSociety Challenge

Python 79 23 Updated Apr 27, 2025

Retrieval and Retrieval-augmented LLMs

Python 9,803 718 Updated May 28, 2025

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 44,888 5,416 Updated May 31, 2025

The official Meta Llama 3 GitHub site

Python 28,750 3,391 Updated Jan 26, 2025

A hands-on, step-by-step course on Hugging Face Transformers; course videos are published in parallel on Bilibili and YouTube.

Jupyter Notebook 2,910 400 Updated Jul 15, 2024

Fully open reproduction of DeepSeek-R1

Python 24,624 2,275 Updated May 28, 2025

Jupyter Notebook 63 5 Updated May 4, 2025

Reproductions of LLM-related algorithms, plus some study notes.

Jupyter Notebook 1,540 225 Updated May 22, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & RFT & Dynamic Sampling & Async Agent RL)

Python 6,904 672 Updated May 30, 2025

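A follow-up experiment to the mini_qwen project, investigating what causes LLMs to repeat themselves (the "parrot" phenomenon) and how general the knowledge-injection effect during fine-tuning is.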

4 1 Updated Jan 22, 2025

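A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization (DPO). The model has 1B parameters and supports Chinese and English.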

Python 408 56 Updated Feb 18, 2025

Train a 1B LLM on 1T tokens from scratch as an individual

Jupyter Notebook 664 70 Updated Apr 27, 2025

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 7,369 464 Updated Nov 6, 2024

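A repository for pretraining from scratch and then SFT-ing a small Chinese LLaMa2; a single 24 GB GPU is enough to produce a chat-llama2 with basic Chinese question-answering ability.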

Python 2,801 333 Updated May 21, 2024

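A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B). All code for the full pipeline is open source: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.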

Python 1,541 174 Updated Apr 20, 2024

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 552 62 Updated Jul 11, 2024

A repository for individuals to experiment with and reproduce the LLM pre-training process.

Python 435 65 Updated May 1, 2025

Implement a small-parameter Chinese large language model from scratch.

Python 664 74 Updated Aug 22, 2024