Stars
Applied AI experiments and examples for PyTorch
📚A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism etc.
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
🤗 smolagents: a barebones library for agents that think in Python code.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Efficient Triton Kernels for LLM Training
Eclipse iceoryx2™ - true zero-copy inter-process communication in pure Rust
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Protected Auction Key/Value Service
The MLscript programming language. Functional and object-oriented; structurally typed and sound; with powerful type inference. Soon to have full interop with TypeScript!
PROPELLER: Profile-Guided Optimizing Large-Scale LLVM-based Relinker
Felafax is building AI infrastructure for non-NVIDIA GPUs
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
Experimentation using the xla compiler from rust
Efficient and easy multi-instance LLM serving
A JAX research toolkit for building, editing, and visualizing neural networks.
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Meaningful control of data in distributed systems.
Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers.
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.