Starred repositories
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories. Join the Discord: https://discord.gg/gMwThUMeme
Fast and memory-efficient exact attention
FlashInfer: Kernel Library for LLM Serving
PyTorch native quantization and sparsity for training and inference
FlashMLA: Efficient MLA decoding kernels
SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Analyze the inference of Large Language Models (LLMs): computation, storage, transmission, and the hardware roofline model, presented in a user-friendly interface.
High-performance safetensors model loader
DeepEP: an efficient expert-parallel communication library
A Datacenter Scale Distributed Inference Serving Framework
coredumpy saves your crash site for post-mortem debugging
[MIRROR] ELF-related utilities for 32/64-bit ELF binaries that can check files for security-relevant properties
Cloud-native high-performance edge/middle/service proxy
Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Webbench is a very simple website stress-testing tool for Linux, written by Radim Kolar in 1997. It uses fork() to simulate multiple clients concurrently requesting a configured URL, measuring how the site performs under load; it can simulate up to 30,000 concurrent connections. Official site: http://home.tiscali.cz/~cz210552/webbench.html (see the sketch after this list).
Portable, simple and extensible C++ logging library
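A minimal sketch of the fork()-based load pattern the Webbench entry describes: each forked child acts as one client that repeatedly issues HTTP GET requests to a single URL for a fixed time window, and the parent sums the per-child request counts over a pipe. This is not Webbench's actual code; the client count, duration, host, port and request string below are illustrative assumptions.

/* fork()-based HTTP load sketch (assumed parameters, not Webbench itself) */
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define CLIENTS  8          /* number of forked client processes (assumed) */
#define DURATION 5          /* seconds each client keeps requesting (assumed) */
#define HOST     "127.0.0.1"
#define PORT     "8080"
#define REQUEST  "GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n"

/* One simulated client: request the target in a loop until time runs out. */
static long run_client(void) {
    long ok = 0;
    time_t end = time(NULL) + DURATION;
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(HOST, PORT, &hints, &res) != 0)
        return 0;
    while (time(NULL) < end) {
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0)
            break;
        if (connect(fd, res->ai_addr, res->ai_addrlen) == 0 &&
            write(fd, REQUEST, strlen(REQUEST)) > 0) {
            char buf[4096];
            while (read(fd, buf, sizeof buf) > 0) /* drain the reply */;
            ok++;
        }
        close(fd);
    }
    freeaddrinfo(res);
    return ok;
}

int main(void) {
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return 1;
    for (int i = 0; i < CLIENTS; i++) {
        if (fork() == 0) {                 /* child: behave as one client */
            long ok = run_client();
            write(pipefd[1], &ok, sizeof ok);
            _exit(0);
        }
    }
    long total = 0, ok;
    for (int i = 0; i < CLIENTS; i++) {    /* parent: collect child counts */
        read(pipefd[0], &ok, sizeof ok);
        total += ok;
        wait(NULL);
    }
    printf("%d clients, %d s: %ld successful requests\n",
           CLIENTS, DURATION, total);
    return 0;
}

The real tool adds URL parsing, HTTP/0.9-1.1 selection, and a proper reporting phase, but the core idea is the same: process-per-client concurrency via fork() rather than threads or an event loop.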