FENP (Jaya Yuan) / Starred · GitHub
  • Huazhong University of Science and Technology

Starred repositories

LLM KV cache compression made easy

Python 499 40 Updated Jun 6, 2025

Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme

TypeScript 6,483 591 Updated Jun 8, 2025

Fast and memory-efficient exact attention

Python 17,717 1,722 Updated Jun 8, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,652 1,256 Updated Jun 7, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,123 324 Updated Jun 8, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,090 280 Updated Jun 7, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,591 838 Updated Apr 29, 2025

SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.

C++ 1,726 67 Updated Apr 14, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 390 91 Updated Jun 6, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,414 611 Updated May 27, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 474 58 Updated Sep 11, 2024

Flexible I/O Tester

C 5,649 1,304 Updated Jun 5, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,560 457 Updated Feb 9, 2025

Redis for LLMs

Python 1,288 199 Updated Jun 8, 2025

High-performance safetensors model loader

Python 36 10 Updated Jun 6, 2025

Magnum IO community repo

C++ 95 17 Updated May 15, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,746 787 Updated Jun 8, 2025

Safe rust wrapper around CUDA toolkit

Rust 854 101 Updated May 7, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,224 405 Updated Jun 8, 2025

coredumpy saves your crash site for post-mortem debugging

Python 688 17 Updated Apr 1, 2025

[MIRROR] ELF related utils for ELF 32/64 binaries that can check files for security relevant properties

C 103 26 Updated Sep 22, 2024

Connect, secure, control, and observe services.

Go 36,936 7,980 Updated Jun 8, 2025

Cloud-native high-performance edge/middle/service proxy

C++ 26,076 4,958 Updated Jun 6, 2025

NVIDIA GPUDirect Storage Driver

C 250 38 Updated May 1, 2025

Sampling profiler for Python programs

Rust 13,754 457 Updated Jun 5, 2025

Memray is a memory profiler for Python

Python 14,022 411 Updated Jun 7, 2025

Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.

Rust 1,327 222 Updated May 30, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,370 278 Updated Jun 6, 2025

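Webbench is a very simple website stress-testing tool for Linux, written by Radim Kolar in 1997. It uses fork() to simulate multiple clients accessing a given URL at the same time and measures how the site performs under load; it can simulate up to 30,000 concurrent connections. Official site: http://home.tiscali.cz/~cz210552/webbench.html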

C 2,718 1,145 Updated Jun 19, 2021

Portable, simple and extensible C++ logging library

C++ 2,344 403 Updated Jan 15, 2025