czhxiaohuihui / Starred · GitHub
  • Shanghai

C++ 73 12 Updated May 16, 2025

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library

Assembly 97 135 Updated Jun 1, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 548 98 Updated Jun 1, 2025
Python 70 3 Updated May 19, 2025

100 days of building GPU kernels!

Cuda 430 43 Updated Apr 27, 2025

Chinese translation of Design Patterns in Modern C++

C++ 527 85 Updated Oct 3, 2021

Efficient Triton Kernels for LLM Training

Python 5,122 341 Updated Jun 1, 2025

LLM Inference benchmark

Python 419 39 Updated Jul 23, 2024

A curated collection of high-quality full-stack LLM resources

Shell 565 67 Updated Nov 25, 2024

Helpful tools and examples for working with flex-attention

Python 805 47 Updated May 30, 2025

Penn CIS 5650 (GPU Programming and Architecture) Final Project

C++ 31 4 Updated Dec 11, 2023

Learn large models through diagrams

Python 302 19 Updated Jul 30, 2024

High-Performance C++ Fundamental Library

C++ 574 80 Updated Dec 17, 2024
Python 193 25 Updated May 5, 2025

Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)

Python 508 53 Updated May 10, 2024

Make triton easier

Python 47 Updated Jun 12, 2024

Supporting PyTorch models with the Google AI Edge TFLite runtime.

Jupyter Notebook 613 83 Updated May 31, 2025

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 123 38 Updated Jan 2, 2025

CUDA tutorials for maths and ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning.

Cuda 183 5 Updated Apr 15, 2025

learning how CUDA works

Cuda 264 36 Updated Mar 3, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 14,982 1,268 Updated May 23, 2024

Guidelines Support Library

C++ 6,421 750 Updated May 22, 2025

Xiao's CUDA Optimization Guide [Active Adding New Contents]

297 20 Updated Nov 8, 2022

GPU programming related news and material links

1,534 88 Updated Jan 6, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,514 452 Updated Feb 9, 2025

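[LLMs Nine-Story Demon Tower] Hands-on practice and experience with LLMs in natural language processing (ChatGLM, Chinese-LLaMA-Alpaca, Vicuna, LLaMA, GPT4ALL, etc.), information retrieval (langchain), speech synthesis, speech recognition, and multimodal areas (Stable Diffusion, MiniGPT-4, VisualGLM-6B, Ziya-Visual, etc.).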

2,059 202 Updated Mar 30, 2024

PyTorch extensions for high performance and large scale training.

Python 3,328 288 Updated Apr 26, 2025

The official Meta Llama 3 GitHub site

Python 28,753 3,394 Updated Jan 26, 2025

LLM training in simple, raw C/CUDA

Cuda 26,757 3,071 Updated May 10, 2025

JSON for Modern C++

C++ 45,878 7,008 Updated May 31, 2025