Stars
Powering AWS's purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow, and integrated with your favorite AWS services
Minimal and annotated implementations of key ideas from modern deep learning research.
A high-throughput and memory-efficient inference and serving engine for LLMs
The AI-native open-source embedding database
Experimental WASM Microkernel Operating System
Large Language Model (LLM) Systems Paper List
A minimal GPU design in Verilog to learn how GPUs work from the ground up
The book "Performance Analysis and Tuning on Modern CPUs"
vLLM's reference system for Kubernetes-native, cluster-wide deployment with community-driven performance optimization
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A machine learning compiler for GPUs, CPUs, and ML accelerators
A Docker image for running stroke lesion core segmentation
Modern C++ Programming Cookbook, Third Edition, Published by PACKT
A toolchain that makes building and running eBPF programs easier
A Datacenter Scale Distributed Inference Serving Framework