StudyingShao (NVJiangShao) / Starred · GitHub

CUDA Templates for Linear Algebra Subroutines

C++ 2 Updated Apr 1, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,549 1,235 Updated May 15, 2025

A PyTorch Toolbox for Grouped GEMM in MoE Model Training

5 1 Updated May 28, 2024

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,415 422 Updated May 17, 2025

The Triton TensorRT-LLM Backend

Python 834 124 Updated May 16, 2025

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 121 37 Updated Jan 2, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 3 Updated May 19, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,514 1,434 Updated May 19, 2025

A plugin for using NVIDIA GPUs in the PySCF package

Cuda 201 36 Updated May 16, 2025

Main Web Site (Online Books)

HTML 9,453 918 Updated Apr 28, 2025

A code security guide compiled for developers

13,436 1,940 Updated Mar 20, 2023