rebornwwp / Starred · GitHub

Starred repositories

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 642 49 Updated May 5, 2025

📚 A from-scratch tutorial on the principles and practice of large language models

7,994 560 Updated Jul 4, 2025

Manually tweaked, auto-generated raylib bindings for zig. https://github.com/raysan5/raylib

Zig 1,210 183 Updated Jun 27, 2025

Plugin for generating HTML reports for pytest results

Python 738 248 Updated Jun 30, 2025

Build and run containers leveraging NVIDIA GPUs

Go 3,400 368 Updated Jul 4, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 7,267 707 Updated Jun 19, 2025

CUDA library for Zig

Zig 100 7 Updated May 30, 2025

TritonParse is a tool designed to help developers analyze and debug Triton kernels by visualizing the compilation process and source code mappings.

TypeScript 118 5 Updated Jul 4, 2025

PDF textbooks for all primary, middle, and high school grades, and for university.

Roff 43,539 9,710 Updated May 18, 2025

AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.

Python 12,324 2,317 Updated Jul 4, 2025

CUDA Matrix Multiplication Optimization

Cuda 197 21 Updated Jul 19, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 3,302 363 Updated Jul 3, 2025

Fast CUDA matrix multiplication from scratch

Cuda 761 118 Updated Dec 28, 2023

learning how CUDA works

Cuda 278 40 Updated Mar 3, 2025

Regularly updated, high-quality live TV stream sources; everyone is welcome to use them, free forever

2,570 247 Updated Jul 4, 2025

Step-by-step optimization of CUDA SGEMM

Cuda 348 46 Updated Mar 30, 2022

CUDA Kernel Benchmarking Library

Cuda 675 79 Updated Jul 4, 2025

Xiao's CUDA Optimization Guide [NO LONGER ADDING NEW CONTENT]

304 20 Updated Nov 8, 2022

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 1,517 92 Updated Jul 2, 2025

Penn CIS 5650 (GPU Programming and Architecture) Final Project

C++ 34 4 Updated Dec 11, 2023

Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.

Jupyter Notebook 14,257 3,985 Updated Jan 7, 2025

Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

Shell 1,000 100 Updated Jul 29, 2024

A pre-commit hook for Ruff.

Python 1,373 65 Updated Jul 3, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 1,359 109 Updated Jul 4, 2025

CUDA checkpoint and restore utility

C 345 19 Updated Jan 27, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,853 279 Updated May 15, 2025

The book "Performance Analysis and Tuning on Modern CPU"

TeX 3,211 219 Updated Jun 9, 2025

AI fundamentals - GPU architecture, CUDA programming, and large-model basics

Jupyter Notebook 148 14 Updated Jun 29, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 41,546 3,308 Updated Jul 4, 2025

Python tool for converting files and office documents to Markdown.

Python 59,856 3,129 Updated Jun 4, 2025