zifeitong / Starred · GitHub

Guidelines Support Library

C++ 6,399 747 Updated Mar 27, 2025

Asynchronous gRPC with Asio/unified executors

C++ 409 37 Updated May 8, 2025

An Extensible Deep Learning Library

Python 2,040 319 Updated May 10, 2025

Applied AI experiments and examples for PyTorch

Python 264 27 Updated Apr 28, 2025

Multi-GPU CUDA stress test

C++ 1,681 322 Updated Aug 20, 2024

Where GPUs get cooked 👩‍🍳🔥

Rust 228 12 Updated Mar 4, 2025

Python 109 14 Updated May 8, 2025

📚 A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc.

Python 3,966 276 Updated May 6, 2025

NanoGPT (124M) in 3 minutes

Python 2,535 296 Updated Apr 26, 2025

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 719 59 Updated Jan 21, 2025

🤗 smolagents: a barebones library for agents that think in python code.

Python 18,322 1,589 Updated May 9, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 1,108 82 Updated May 9, 2025

Efficient Triton Kernels for LLM Training

Python 4,987 316 Updated May 8, 2025

Optimizing inference proxy for LLMs

Python 2,213 172 Updated May 7, 2025

Eclipse iceoryx2™ - true zero-copy inter-process-communication in pure Rust

Rust 1,426 62 Updated May 5, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,084 159 Updated Mar 26, 2025

Protected Auction Key/Value Service

C++ 61 24 Updated May 9, 2025

The MLscript programming language. Functional and object-oriented; structurally typed and sound; with powerful type inference. Soon to have full interop with TypeScript!

Scala 187 32 Updated May 1, 2025

PROPELLER: Profile Guided Optimizing Large Scale LLVM-based Relinker

C++ 411 40 Updated May 9, 2025

Felafax is building AI infra for non-NVIDIA GPUs

Jupyter Notebook 560 35 Updated Jan 24, 2025

Sparse nonlinear least squares in JAX

Python 199 13 Updated May 8, 2025

Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild

Zig 2,248 80 Updated May 9, 2025

Experimentation using the xla compiler from rust

Rust 92 16 Updated Aug 17, 2024

Efficient and easy multi-instance LLM serving

Python 402 31 Updated May 9, 2025

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,779 62 Updated Apr 26, 2025

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 701 125 Updated Feb 21, 2025

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 606 47 Updated May 5, 2025

Meaningful control of data in distributed systems.

Rust 1,361 121 Updated May 9, 2025

Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers.

C++ 434 54 Updated May 2, 2025

Deep learning for dummies. All the practical details and useful utilities that go into working with real models.

Python 791 40 Updated Apr 30, 2025