zhuohan123 (Zhuohan Li) / Starred · GitHub

Organizations

@alpa-projects @vllm-project


A PyTorch native platform for training generative AI models

Python 3,822 375 Updated May 21, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 1,185 92 Updated May 21, 2025

A program to read, merge, and write programs for the Breville Control °Freak®

Java 23 1 Updated Dec 31, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 90,186 24,223 Updated May 22, 2025

The Startup CTO's Handbook, a book covering leadership, management and technical topics for leaders of software engineering teams

13,638 754 Updated Mar 19, 2025
Python 100 4 Updated Mar 20, 2025

Load compute kernels from the Hub

Python 130 7 Updated May 21, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,905 884 Updated May 21, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,367 598 Updated May 20, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,778 277 Updated May 15, 2025

common in-memory tensor structure

C++ 992 147 Updated May 12, 2025

The best OSS video generation models

Python 3,168 353 Updated Jan 8, 2025

Manipulating Python Programs

Python 661 29 Updated May 16, 2025

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 1,371 131 Updated May 22, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 810 37 Updated May 10, 2025

Dynamic Memory Management for Serving LLMs without PagedAttention

C 376 30 Updated Apr 18, 2025

A framework for few-shot evaluation of language models.

Python 8,983 2,403 Updated May 22, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 939 60 Updated Apr 15, 2025

Blazingly fast LLM inference.

Rust 5,622 403 Updated May 22, 2025

🙌 OpenHands: Code Less, Make More

Python 54,772 6,184 Updated May 22, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 72 97 Updated May 22, 2025

Tile primitives for speedy kernels

Cuda 2,362 142 Updated May 21, 2025

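A visual no-code/code-free web crawler/spider. 易采集: a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.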

JavaScript 38,880 4,767 Updated May 20, 2025

A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems

Python 167 9 Updated Oct 15, 2024

Arena-Hard-Auto: An automatic LLM benchmark.

Python 830 101 Updated May 1, 2025
Python 3,509 343 Updated May 13, 2025

CUDA/Metal accelerated language model inference

C 577 25 Updated Apr 10, 2025

DSPy: The framework for programming—not prompting—language models

Python 24,407 1,888 Updated May 22, 2025

A parallel framework for training deep neural networks

Python 60 5 Updated Mar 16, 2025

[ICML 2024] CLLMs: Consistency Large Language Models

Python 391 16 Updated Nov 16, 2024