RIT-Boston
- Princeton, NJ
Starred repositories
Visual testing tool for MCP servers
🚀 The fast, Pythonic way to build MCP servers and clients
bespokelabsai / verifiers
Forked from willccbb/verifiers
Verifiers for LLM Reinforcement Learning
Democratizing Reinforcement Learning for LLMs
Building DeepSeek R1 from Scratch
Sky-T1: Train your own O1 preview model within $450
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Synthetic data curation for post-training and structured data extraction
Fantastic Data Engineering for Large Language Models
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
A series of math-specific large language models of our Qwen2 series.
[Preprint] AIPO: Improving Training Objective for Iterative Preference Optimization
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
A course on aligning smol models.
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
The Open Cookbook for Top-Tier Code Large Language Model
[NeurIPS'24] SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
Hammer: Robust Function-Calling for On-Device Language Models via Function Masking
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
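Welcome to LLM-Dojo, an open-source place for learning about large language models. It uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓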
Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"
[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct