University of Wisconsin-Madison, Madison, WI, USA
https://arist12.github.io/
Stars
My learning notes and code for ML systems (MLSys).
A final sanity checklist to help your CS paper get accepted, not desk-rejected.
Revisiting Mid-training in the Era of RL Scaling
A series of math-specific large language models built on the Qwen2 series.
MoBA: Mixture of Block Attention for Long-Context LLMs (a minimal sketch of the block-routing idea follows this list).
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Paper list for Efficient Reasoning.
An extremely fast Python package and project manager, written in Rust.
Development repository for the Triton language and compiler
LightLLM is a Python-based LLM inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA etc.🔥
Collection of Summer 2025 tech internships!
FlashInfer: Kernel Library for LLM Serving
📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism etc.
Machine Learning Engineering Open Book
Ongoing research training transformer models at scale
SGLang is a fast serving framework for large language models and vision language models.
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demos in SIGIR 2023 and SIGIR 2025.
A tool for extracting plain text from Wikipedia dumps
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
This project has grown well beyond its original idea: a collection of premium software across various categories.
A heterogeneous benchmark for information retrieval. Easy to use; evaluate your models across 15+ diverse IR datasets.
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context".
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
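
Several of the stars above (MoBA, FlashInfer, the Awesome-LLM-Inference list) revolve around sparse or block-structured attention. Below is a minimal PyTorch sketch of the block-routing idea behind MoBA: each query scores key blocks by a mean-pooled block key and attends only to its top-k blocks. This is an illustrative sketch under stated assumptions, not the repo's implementation; the function name, the mean-pooling gate, and the parameters block_size and top_k are assumptions, and causal masking is omitted for brevity.

import torch
import torch.nn.functional as F

def moba_style_attention(q, k, v, block_size=64, top_k=2):
    # q, k, v: (T, d) single-head tensors; T assumed divisible by block_size.
    T, d = k.shape
    n_blocks = T // block_size
    k_blocks = k.view(n_blocks, block_size, d)
    v_blocks = v.view(n_blocks, block_size, d)
    # Route: score each block by the query's similarity to the block's mean key.
    gate = q @ k_blocks.mean(dim=1).T          # (T, n_blocks)
    top = gate.topk(top_k, dim=-1).indices     # (T, top_k) chosen block ids
    out = torch.empty_like(q)
    for i in range(T):
        sel_k = k_blocks[top[i]].reshape(-1, d)   # keys of the selected blocks
        sel_v = v_blocks[top[i]].reshape(-1, d)
        # Standard scaled dot-product attention, restricted to selected blocks.
        attn = F.softmax((q[i] @ sel_k.T) / d ** 0.5, dim=-1)
        out[i] = attn @ sel_v
    return out

# Example: 256 tokens, 32-dim head; each query attends to its 2 best blocks.
q, k, v = (torch.randn(256, 32) for _ in range(3))
print(moba_style_attention(q, k, v).shape)   # torch.Size([256, 32])

An actual implementation would fuse block selection with an efficient tiled attention kernel rather than looping per query in Python; the loop here only makes the selection logic explicit.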