ZeYang1025 (Ze Yang) / Starred · GitHub

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Python 16,514 1,696 Updated Jun 30, 2025

Development repository for the Triton language and compiler

MLIR 15,999 2,080 Updated Jul 1, 2025

TritonParse is a tool designed to help developers analyze and debug Triton kernels by visualizing the compilation process and source code mappings.

TypeScript 116 3 Updated Jul 1, 2025

Code for the paper "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"

Python 89 1 Updated Jun 24, 2025

A benchmark that challenges language models to code solutions for scientific problems

Python 125 19 Updated Jun 30, 2025

Nano vLLM

Python 4,685 538 Updated Jun 27, 2025

Examples for Recommenders - easy to train and deploy on accelerated infrastructure.

Python 60 17 Updated Jun 27, 2025

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 9,216 999 Updated Jun 30, 2025

Build and deploy stateful agents across federated resources

Python 6 Updated Jul 1, 2025

GitHub mirror of the triton-lang/triton repo.

MLIR 46 16 Updated Jun 30, 2025

[ICML'25] Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

Python 5 1 Updated May 22, 2025
Python 2 Updated Jun 25, 2025

Browser script to share and export ChatGPT chat logs to Markdown, JSON, or as Image (PNG)

JavaScript 161 21 Updated Jun 6, 2025

FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]

Python 28 1 Updated May 30, 2025

A COOL compiler, now extending to MLIR

LLVM 1 Updated Feb 13, 2025

A large-scale simulation framework for LLM inference

Python 394 66 Updated Jun 25, 2025

Free, simple, fast interactive diagrams for any GitHub repository

TypeScript 13,636 984 Updated May 26, 2025

Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"

Jupyter Notebook 482 42 Updated Jun 30, 2025

Ultra and Unified CCL

C++ 267 13 Updated Jul 1, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,518 444 Updated Jun 29, 2025

🎉CUDA notes / collection of frequently asked interview questions / C++ notes; personal notes, updated sporadically: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Cuda 33 Updated Feb 23, 2024

A Datacenter Scale Distributed Inference Serving Framework

Rust 4,386 456 Updated Jul 1, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 992 67 Updated May 28, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,276 358 Updated Jul 1, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,776 1,292 Updated Jun 27, 2025

This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.

C++ 177 68 Updated Jun 30, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 592 107 Updated Jul 1, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA.🎉

Cuda 5,210 553 Updated Jun 29, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,185 289 Updated Jun 30, 2025

Reproducing R1 for Code with Reliable Rewards

Python 223 17 Updated May 5, 2025