Stars
High-speed and easy-to-use LLM serving framework for local deployment
💻 A better and friendly vi(vim) mode plugin for ZSH.
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
⏩ Create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Swift Package to implement a transformers-like API in Swift
Everything we actually know about the Apple Neural Engine (ANE)
LLM deployment project based on MNN. This project has been merged into MNN.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
[ICLR 2024] Lemur: Open Foundation Models for Language Agents
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning".
Anonymous Github is a proxy server to support anonymous browsing of Github repositories for open-science code and data.
[EMNLP 2022] TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data
Training and serving large-scale neural networks with auto parallelization.
pku-liang / Sanger
Forked from hatsu3/Sanger
A co-design architecture on sparse attention
Showcase for rankit http://github.com/wattlebird/ranking/
A Python library for fractional fixed-point (base 2) arithmetic and binary manipulation with NumPy compatibility.
🚀 A very efficient Texas Holdem GTO solver
Raytracer tutorial for PPCA 2021, written in Rust.