Stars
Data and tools for generating and inspecting OLMo pre-training data.
Knowledge transfer from high-resource to low-resource programming languages for Code LLMs
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4, ...)
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
High-level asynchronous concurrency and networking framework that works on top of either trio or asyncio (usage sketch after this list)
Aidan Bench attempts to measure <big_model_smell> in LLMs.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Helpful tools and examples for working with flex-attention (see the sketch after this list)
A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton.
A PyTorch native platform for training generative AI models
Ring attention implementation with flash attention
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
This repository combines the CPO and SimPO methods for better reference-free preference learning.
go-trafilatura is a Go port of the trafilatura Python library.
Qodo-Cover: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
Arena-Hard-Auto: An automatic LLM benchmark.
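For the AnyIO entry above, a minimal sketch of structured concurrency with a task group; it runs on the default asyncio backend and, if trio is installed, on trio as well. The worker names and delays here are illustrative, not part of the library.

```python
import anyio

async def worker(name: str, delay: float) -> None:
    # simulate some I/O-bound work
    await anyio.sleep(delay)
    print(f"{name} finished after {delay}s")

async def main() -> None:
    # the task group waits for all spawned tasks before exiting
    async with anyio.create_task_group() as tg:
        tg.start_soon(worker, "fast", 0.1)
        tg.start_soon(worker, "slow", 0.5)

anyio.run(main)                     # asyncio backend (default)
# anyio.run(main, backend="trio")   # same code on trio, if installed
```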
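And for the flex-attention entry, a minimal sketch of PyTorch's `flex_attention` with a causal block mask, assuming PyTorch >= 2.5; the tensor shapes are arbitrary example values.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

def causal(b, h, q_idx, kv_idx):
    # keep only keys at or before the query position
    return q_idx >= kv_idx

B, H, S, D = 2, 4, 128, 64
# CPU support for flex_attention depends on the PyTorch version; GPU is the common path
device = "cuda" if torch.cuda.is_available() else "cpu"
q, k, v = (torch.randn(B, H, S, D, device=device) for _ in range(3))

block_mask = create_block_mask(causal, B, H, S, S, device=device)
out = flex_attention(q, k, v, block_mask=block_mask)  # shape (B, H, S, D)
```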