hanzz2007

qiuhan hanzz2007

7 followers · 99 following

China

ScaleLLM Public
Forked from vectorch-ai/ScaleLLM

A high-performance inference system for large language models, designed for production environments.

C++ Apache License 2.0 Updated Apr 16, 2025
flux Public
Forked from bytedance/flux

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ Apache License 2.0 Updated Mar 12, 2025
awesome-cuda-triton-hpc Public
Forked from coderonion/awesome-cuda-and-hpc

🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.

Updated Jan 29, 2025
pdfium-binaries Public
Forked from bblanchon/pdfium-binaries

📰 Binary distribution of PDFium

Shell Updated Jan 27, 2025
heaptrack Public
Forked from KDE/heaptrack

A heap memory profiler for Linux

C++ Updated Jan 26, 2025
awesome-gemm Public
Forked from yuninxia/awesome-gemm

📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software

MIT License Updated Dec 21, 2024
albumentations Public
Forked from albumentations-team/albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

Python 1 MIT License Updated Nov 8, 2024
Clipper2 Public
Forked from AngusJohnson/Clipper2

Polygon Clipping and Offsetting - C++, C# and Delphi

C++ Boost Software License 1.0 Updated Oct 14, 2024
blahtexml Public
Forked from gvanas/blahtexml

Blahtexml

C++ Updated Jul 3, 2024
backward-cpp Public
Forked from bombela/backward-cpp

A beautiful stack trace pretty printer for C++

C++ MIT License Updated Jun 24, 2024
CRCpp Public
Forked from d-bahr/CRCpp

Easy to use and fast C++ CRC library.

C++ Other Updated Apr 23, 2024
inferflow Public
Forked from inferflow/inferflow

Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).

C++ MIT License Updated Jan 16, 2024
uni-algo Public
Forked from uni-algo/uni-algo

Unicode Algorithms Implementation for C/C++

C++ Other Updated Jan 5, 2024
libassert Public
Forked from jeremy-rifkin/libassert

The most over-engineered and overpowered C++ assertion library.

C++ MIT License Updated Dec 7, 2023
perf-book Public
Forked from dendibakh/perf-book

The book "Performance Analysis and Tuning on Modern CPU"

TeX Creative Commons Zero v1.0 Universal Updated Dec 4, 2023
spconv Public
Forked from traveller59/spconv

Spatial Sparse Convolution Library

Python Apache License 2.0 Updated Oct 7, 2023
MPMCQueue Public
Forked from rigtorp/MPMCQueue

A bounded multi-producer multi-consumer concurrent queue written in C++11

C++ MIT License Updated Sep 18, 2023
INT8-Flash-Attention-FMHA-Quantization Public
Forked from jundaf2/INT8-Flash-Attention-FMHA-Quantization

Cuda Updated Sep 15, 2023
flash_attention_inference Public
Forked from ShaYeBuHui01/flash_attention_inference

Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.

C++ MIT License Updated Aug 31, 2023
onnxruntime Public
Forked from microsoft/onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ MIT License Updated Aug 3, 2023
cpptrace Public
Forked from jeremy-rifkin/cpptrace

Lightweight, zero-configuration-required, and cross-platform stacktrace library for C++

C++ MIT License Updated Jul 29, 2023
How_to_optimize_in_GPU Public
Forked from Liu-xiandong/How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda Apache License 2.0 Updated Jul 29, 2023
lmdeploy Public
Forked from InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLM

C++ Apache License 2.0 Updated Jul 28, 2023
arm-gcc-inline-assembler Public
Forked from chunhuajiang/arm-gcc-inline-assembler

ARM GCC Inline Assembler Reference Manual - Chinese Edition

HTML 1 Updated Jul 24, 2023
vllm Public
Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python Apache License 2.0 Updated Jun 20, 2023
tokenizers-cpp Public
Forked from mlc-ai/tokenizers-cpp

Universal cross-platform tokenizers binding to HF and sentencepiece

C++ Apache License 2.0 Updated Jun 3, 2023
cudabmk Public
Forked from spthm/cudabmk

Source for Demystifying GPU Microarchitecture through Microbenchmarking

Cuda Updated May 29, 2023
excalidraw Public
Forked from excalidraw/excalidraw

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 1 MIT License Updated May 3, 2023
crc32 Public
Forked from Michaelangel007/crc32

CRC32 Demystified

C++ Updated Apr 23, 2023
sentry-native Public
Forked from getsentry/sentry-native

Sentry SDK for C, C++ and native applications.

C MIT License Updated Apr 21, 2023
