Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 474 58 Updated Sep 11, 2024

axboe / fio

Flexible I/O Tester

C 5,649 1,304 Updated Jun 5, 2025

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 4,560 457 Updated Feb 9, 2025

LMCache / LMCache

Redis for LLMs

Python 1,288 199 Updated Jun 8, 2025

foundation-model-stack / fastsafetensors

High-performance safetensors model loader

Python 36 10 Updated Jun 6, 2025

NVIDIA / MagnumIO

Magnum IO community repo

C++ 95 17 Updated May 15, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,746 787 Updated Jun 8, 2025

coreylowman / cudarc

Safe rust wrapper around CUDA toolkit

Rust 854 101 Updated May 7, 2025

ai-dynamo / dynamo

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,224 405 Updated Jun 8, 2025

gaogaotiantian / coredumpy

coredumpy saves your crash site for post-mortem debugging

Python 688 17 Updated Apr 1, 2025

gentoo / pax-utils

[MIRROR] ELF related utils for ELF 32/64 binaries that can check files for security relevant properties

C 103 26 Updated Sep 22, 2024

istio / istio

Connect, secure, control, and observe services.

Go 36,936 7,980 Updated Jun 8, 2025

envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy

C++ 26,076 4,958 Updated Jun 6, 2025

NVIDIA / gds-nvidia-fs

NVIDIA GPUDirect Storage Driver

C 250 38 Updated May 1, 2025

benfred / py-spy

Sampling profiler for Python programs

Rust 13,754 457 Updated Jun 5, 2025

bloomberg / memray

Memray is a memory profiler for Python

Python 14,022 411 Updated Jun 7, 2025

dragonflyoss / nydus

Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.

Rust 1,327 222 Updated May 30, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,370 278 Updated Jun 6, 2025

EZLippi / WebBench

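Webbench is a very simple website stress-testing tool for Linux, written by Radim Kolar in 1997. It uses fork() to simulate multiple clients concurrently accessing a given URL and measures how the site performs under load; it can simulate up to 30,000 concurrent connections. Official site: http://home.tiscali.cz/~cz210552/webbench.html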

C 2,718 1,145 Updated Jun 19, 2021

SergiusTheBest / plog

Portable, simple and extensible C++ logging library

C++ 2,344 403 Updated Jan 15, 2025

Jaya Yuan FENP
