University of Wisconsin-Madison, Madison, WI, USA
https://arist12.github.io/
Stars
My learning notes and code for ML systems (MLSys).
A final sanity checklist to help your CS paper get accepted, not desk-rejected.
Revisiting Mid-training in the Era of RL Scaling
A series of math-specific large language models built on the Qwen2 series.
MoBA: Mixture of Block Attention for Long-Context LLMs (a minimal sketch of the block-routing idea follows this list).
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Paper list for Efficient Reasoning.
An extremely fast Python package and project manager, written in Rust.
Development repository for the Triton language and compiler
LightLLM is a Python-based LLM inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA etc.🔥
Collection of Summer 2025 tech internships!
FlashInfer: Kernel Library for LLM Serving
📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism etc.
Machine Learning Engineering Open Book
Ongoing research training transformer models at scale
SGLang is a fast serving framework for large language models and vision language models.
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demos in SIGIR 2023 and SIGIR 2025.
A tool for extracting plain text from Wikipedia dumps
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
This project has grown well beyond its original idea: a collection of premium software across various categories.
A heterogeneous benchmark for information retrieval. Easy to use; evaluate your models across 15+ diverse IR datasets.
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context".
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
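
Several of the stars above (MoBA, FlashInfer, the Awesome-LLM-Inference list) revolve around sparse or block-structured attention. Below is a minimal PyTorch sketch of the block-routing idea behind MoBA: each query scores key blocks by a mean-pooled block key and attends only to its top-k blocks. This is an illustrative sketch under stated assumptions, not the repo's implementation; the function name, the mean-pooling gate, and the parameters block_size and top_k are assumptions, and causal masking is omitted for brevity.

import torch
import torch.nn.functional as F

def moba_style_attention(q, k, v, block_size=64, top_k=2):
    # q, k, v: (T, d) single-head tensors; T assumed divisible by block_size.
    T, d = k.shape
    n_blocks = T // block_size
    k_blocks = k.view(n_blocks, block_size, d)
    v_blocks = v.view(n_blocks, block_size, d)
    # Route: score each block by the query's similarity to the block's mean key.
    gate = q @ k_blocks.mean(dim=1).T          # (T, n_blocks)
    top = gate.topk(top_k, dim=-1).indices     # (T, top_k) chosen block ids
    out = torch.empty_like(q)
    for i in range(T):
        sel_k = k_blocks[top[i]].reshape(-1, d)   # keys of the selected blocks
        sel_v = v_blocks[top[i]].reshape(-1, d)
        # Standard scaled dot-product attention, restricted to selected blocks.
        attn = F.softmax((q[i] @ sel_k.T) / d ** 0.5, dim=-1)
        out[i] = attn @ sel_v
    return out

# Example: 256 tokens, 32-dim head; each query attends to its 2 best blocks.
q, k, v = (torch.randn(256, 32) for _ in range(3))
print(moba_style_attention(q, k, v).shape)   # torch.Size([256, 32])

An actual implementation would fuse block selection with an efficient tiled attention kernel rather than looping per query in Python; the loop here only makes the selection logic explicit.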