Denverzyl / Starred · GitHub

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,056 230 Updated Jun 26, 2025

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 520 53 Updated May 27, 2025

A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, rocWMMA), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA environments.

Python 43 6 Updated Aug 25, 2024

HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo

Python 1,535 144 Updated May 20, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 10,482 953 Updated Jun 3, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 29,518 6,067 Updated Jun 27, 2025

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Python 908 37 Updated Jun 8, 2025

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 29,575 3,649 Updated Jul 23, 2024

Fast and memory-efficient exact attention

Python 18,047 1,772 Updated Jun 25, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,510 1,525 Updated Jun 13, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,520 2,204 Updated Jun 27, 2025

Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models

Python 185 24 Updated Jun 25, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 1 Updated Mar 12, 2025

AI Tensor Engine for ROCm

Python 210 59 Updated Jun 27, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,840 279 Updated May 15, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,481 630 Updated Jun 23, 2025

[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding

Python 116 9 Updated Dec 4, 2024

DeepEP: an efficient expert-parallel communication library

Cuda 8,217 822 Updated Jun 27, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,628 871 Updated Apr 29, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,461 1,029 Updated Jun 26, 2025

NVIDIA Linux open GPU with P2P support

C 1,178 116 Updated Jun 6, 2025

Computer science books Recommended by AzatAI. (Education ONLY)

Python 1,016 286 Updated Sep 27, 2023

IIMS College AI class of batch 2022

Jupyter Notebook 138 58 Updated Jul 16, 2023

Machine Learning Resources, Practice and Research

Python 4,222 1,567 Updated Jun 26, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,344 264 Updated Jun 27, 2025

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 110,294 17,930 Updated Jun 27, 2025

Python 213 17 Updated Jan 23, 2025

Accelerate inference without tears

Python 319 21 Updated Mar 14, 2025

REST: Retrieval-Based Speculative Decoding, NAACL 2024

C 204 15 Updated Dec 2, 2024

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,805 239 Updated Jun 24, 2025