wangfakang

sky wangfakang

208 followers · 72 following

Alibaba
HangZhou
http://wangfakang.github.io

Stars

Victarry / PP-Schedule-Visualization

Pipeline Parallelism Emulation and Visualization

Python 45 3 Updated Jun 12, 2025

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 803 68 Updated Jun 3, 2025

Infrawaves / DeepEP_ibrc_dual-ports_multiQP

Aims to implement dual-port and multi-QP solutions in the DeepEP IBRC transport

Cuda 51 2 Updated May 9, 2025

ppl-ai / pplx-kernels

Perplexity GPU Kernels

C++ 377 44 Updated Jun 10, 2025

ByteDance-Seed / SDP4Bit

Official implementation of the paper "SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training"

Python 38 7 Updated Dec 11, 2024

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,840 279 Updated May 15, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 8,208 820 Updated Jun 26, 2025

nv-morpheus / Morpheus

Morpheus SDK

Python 490 177 Updated Jun 25, 2025

microsoft / SimMIM

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

Python 984 97 Updated Sep 29, 2022

SJTU-IPADS / PhoenixOS

Fast OS-level support for GPU checkpoint and restore

C++ 199 20 Updated Jun 18, 2025

aliyun / aicb

HTML 203 36 Updated May 30, 2025

NVIDIA / cuda-checkpoint

CUDA checkpoint and restore utility

C 345 19 Updated Jan 27, 2025

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,854 1,526 Updated Jun 26, 2025

PrimeIntellect-ai / prime

prime is a framework for efficient, globally distributed training of AI models over the internet.

Python 771 82 Updated May 22, 2025

aliyun / SimCCL

42 3 Updated Nov 5, 2024

singalen / logrotee

A tee-like program that copies stdin to rotated log files and can compress them.

C++ 13 5 Updated Jan 28, 2018

NVIDIA / nvidia-resiliency-ext

NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 179 25 Updated Jun 7, 2025