laixinn (laixin) / Starred · GitHub

KV cache store for distributed LLM inference

C++ 191 20 Updated May 14, 2025

A lightweight design for computation-communication overlap.

Cuda 109 2 Updated May 6, 2025

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 493 42 Updated Apr 21, 2025

The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems)

HTML 232 3 Updated Jan 5, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA etc.🔥

Cuda 4,195 449 Updated May 12, 2025

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 13 3 Updated Apr 26, 2025

AGE animation official website URL release page(AGE动漫官网网址发布页)

1 Updated Apr 21, 2024
Python 58 2 Updated Apr 26, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,251 245 Updated May 14, 2025

Nvidia Instruction Set Specification Generator

Python 260 11 Updated Jul 9, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,218 254 Updated May 14, 2025

Distributed Triton for Parallel Systems

Python 692 43 Updated May 12, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,682 652 Updated May 13, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,337 155 Updated Mar 20, 2025

Large Language Model (LLM) Systems Paper List

1,221 68 Updated May 10, 2025

Auto Switch Rule for SwitchyOmega

JavaScript 590 55 Updated Mar 7, 2025

RussellGroupCV is a resume template made by following the guidelines followed by the Russell Group in the UK.

TeX 148 34 Updated Feb 16, 2024

Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard

JavaScript 11 1 Updated Dec 14, 2024

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,763 276 Updated Apr 14, 2025

QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.

Python 121 13 Updated Apr 7, 2025

A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute

Python 641 88 Updated Oct 24, 2023
Jupyter Notebook 149 8 Updated Mar 12, 2024
C# 13 1 Updated Jun 23, 2023

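Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and more)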

8,230 1,359 Updated Apr 24, 2024

leaked prompts of GPTs

29,806 4,044 Updated Sep 27, 2024

A debugging and profiling tool that can trace and visualize python code execution

Python 6,539 434 Updated May 14, 2025

Yuhong Luo and Pan Li. Neighborhood-aware scalable temporal network representation learning. In Learning on Graphs, 2022.

Python 27 5 Updated May 6, 2023

📃 White paper for Backend developers

3,299 316 Updated Feb 2, 2025

A list of awesome GNN systems.

Python 314 26 Updated May 14, 2025