Stars
GRPO training script for a Qwen model on the GSM8K dataset. This script trains a Qwen model with GRPO (Group Relative Policy Optimization) on the GSM8K (Grade School Math 8K) dataset.
A simple example of using the GRPO (Group Relative Policy Optimization) trainer from the TRL library to fine-tune the Qwen2.5-1.5B model on the TL;DR summarization dataset.
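A minimal sketch of what such a script looks like, following TRL's documented GRPOTrainer quickstart; the reward function, output directory, and exact checkpoint name are illustrative assumptions, not the contents of either repo above:

    # Minimal GRPO sketch (assumes trl>=0.14 and datasets are installed).
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    # TL;DR prompts, matching the summarization example described above.
    dataset = load_dataset("trl-lib/tldr", split="train")

    # Illustrative reward: prefer completions close to 200 characters.
    def reward_len(completions, **kwargs):
        return [-abs(200 - len(c)) for c in completions]

    trainer = GRPOTrainer(
        model="Qwen/Qwen2.5-1.5B-Instruct",       # assumed checkpoint name
        reward_funcs=reward_len,
        args=GRPOConfig(output_dir="qwen-grpo"),  # hypothetical output path
        train_dataset=dataset,
    )
    trainer.train()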
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
General technology for enabling AI capabilities w/ LLMs and MLLMs
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into LLM safety.
Chinese safety prompts for evaluating and improving the safety of LLMs.
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
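As a hedged illustration of the memory-efficient fine-tuning this enables, a minimal LoRA setup with Unsloth's FastLanguageModel might look like the following; the checkpoint name, rank, and target modules are assumptions for illustration, not the library's defaults:

    # Minimal Unsloth LoRA sketch (assumes the unsloth package is installed).
    from unsloth import FastLanguageModel

    # Load a 4-bit quantized base model; the checkpoint name is illustrative.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen2.5-1.5B-Instruct",
        max_seq_length=2048,
        load_in_4bit=True,
    )

    # Attach LoRA adapters; rank and target modules are typical choices here.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_alpha=16,
    )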
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
0-hero / vllm-experiments
Forked from vllm-project/vllm. The official vLLM implementation is available for Mixtral; this version supports the DiscoLM Mixtral models.
A Chinese guide to prompting ChatGPT, with usage guides for various scenarios and tips on getting it to follow your instructions.
A series of large language models trained from scratch by developers @01-ai
Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.).
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A series of large language models developed by Baichuan Intelligent Technology
Llama Chinese community: a continuously updated collection of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama LLMs; fully open source and commercially usable.
Chinese-LLaMA 1&2 and Chinese-Falcon base models; the ChatFlow Chinese dialogue model; a Chinese OpenLLaMA model; NLP pretraining and instruction-tuning datasets.
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, covering base models, vertical-domain fine-tunes and applications, datasets, and tutorials.
The kanchil (鼷鹿) is the world's smallest even-toed ungulate; this open-source project explores whether small models (under 6B parameters) can also be aligned with human preferences.
🎉 Repo for LaWGPT, a Chinese LLaMA tuned with Chinese legal knowledge.
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment.
Chinese legal LLaMA (LLaMA for the Chinese legal domain).
KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation