blossomin (CHEN Dong) / Starred · GitHub

Distributed KV cache coordinator

Go 39 12 Updated Jun 26, 2025

A KV storage engine based on LSM Tree, supporting Redis RESP

C++ 183 25 Updated Jun 24, 2025

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 1,291 71 Updated Jun 30, 2025

Cataloging released Triton kernels.

240 12 Updated Jan 10, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,817 298 Updated Mar 10, 2025
Python 74 8 Updated Apr 2, 2025

GPU programming related news and material links

1,602 89 Updated Jan 6, 2025

Perplexity GPU Kernels

C++ 380 46 Updated Jun 10, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,378 455 Updated Jun 30, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,655 468 Updated Jun 18, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 992 67 Updated May 28, 2025

KV cache store for distributed LLM inference

C++ 278 28 Updated Jun 6, 2025

Efficient Mixture of Experts for LLM Paper List

Python 79 3 Updated Dec 15, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

814 45 Updated Jun 22, 2025

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 1,417 217 Updated Jun 30, 2025

Evaluation code for confidential virtual machines (AMD SEV-SNP / Intel TDX)

Python 10 3 Updated Apr 23, 2025

Advanced Privacy-Preserving Federated Learning framework

Python 140 25 Updated Jun 27, 2025

Curated collection of papers in MoE model inference

203 8 Updated Feb 19, 2025
Python 25 11 Updated May 19, 2025

A high-performance inference system for large language models, designed for production environments.

C++ 449 37 Updated Jun 25, 2025

Private Cloud Compute (PCC)

Swift 829 79 Updated Apr 11, 2025

Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models

1,234 73 Updated Feb 24, 2025

Awesome LLMs on Device: A Comprehensive Survey

1,137 104 Updated Jan 12, 2025

A curated list for Efficient Large Language Models

Python 1,756 140 Updated Jun 17, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 10,161 1,680 Updated Jun 30, 2025

MNPWAD: Multi-Normal Prototypes Learning for Weakly Supervised Anomaly Detection

Python 3 Updated Jun 19, 2025

LLM KV cache compression made easy

Python 523 42 Updated Jun 20, 2025