Stars
Large Language Model (LLM) Systems Paper List
Distributed Triton for Parallel Systems
A Datacenter Scale Distributed Inference Serving Framework
Since the emergence of ChatGPT in 2022, accelerating Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…
No fortress, purely open ground. OpenManus is Coming.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Python tool for converting files and office documents to Markdown.
A lightweight data processing framework built on DuckDB and 3FS.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient MLA decoding kernels
Official Repo for Open-Reasoner-Zero
A throughput-oriented high-performance serving framework for LLMs
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Fully open reproduction of DeepSeek-R1
Janus-Series: Unified Multimodal Understanding and Generation Models
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
A general and accurate MACs / FLOPs profiler for PyTorch models
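2025 recommendations for VPN and censorship-circumvention software in China, with pitfalls to avoid; stable and easy to use. Compares SSR airport services, Lantern, V2ray, Laowang VPN, self-hosted VPS proxies, and other circumvention tools, with the latest VPN download recommendations for China, including for accessing ChatGPT.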
A simple, performant and scalable JAX LLM!
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos