- Beijing
CUDA-Learn-Notes Public
Forked from DefTruth/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
DeepSeek-VL2 Public
Forked from deepseek-ai/DeepSeek-VL2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
xla Public
Forked from openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
dify Public
Forked from langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
TypeScript Other Updated Oct 23, 2024
lolcats Public
Forked from HazyResearch/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
Python Apache License 2.0 Updated Oct 16, 2024
swarm Public
Forked from openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Python MIT License Updated Oct 13, 2024
Vitis-AI Public
Forked from Xilinx/Vitis-AI
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
Python Apache License 2.0 Updated Sep 12, 2024
kserve Public
Forked from kserve/kserve
Standardized Serverless ML Inference Platform on Kubernetes
Python Apache License 2.0 Updated Aug 30, 2024
transfusion-pytorch Public
Forked from lucidrains/transfusion-pytorch
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
MIT License Updated Aug 23, 2024
marlin Public
Forked from IST-DASLab/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.
Python Apache License 2.0 Updated Aug 15, 2024
segment-anything-2 Public
Forked from facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Jupyter Notebook Apache License 2.0 Updated Jul 31, 2024
vattention Public
Forked from microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
C MIT License Updated Jul 22, 2024
flashinfer Public
Forked from flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Cuda Apache License 2.0 Updated Jul 16, 2024
ScaleLLM Public
Forked from vectorch-ai/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
C++ Apache License 2.0 Updated Jul 3, 2024
lectures Public
Forked from gpu-mode/lectures
Material for cuda-mode lectures
Jupyter Notebook Apache License 2.0 Updated Jun 13, 2024
LLMBench Public
A library for validating and benchmarking LLM inference.
Python Apache License 2.0 Updated Jun 7, 2024
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated May 10, 2024
LLaVA Public
Forked from haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Python Apache License 2.0 Updated May 7, 2024
triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
C++ MIT License Updated Mar 7, 2024
mlc-llm Public
Forked from mlc-ai/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Python Apache License 2.0 Updated Mar 6, 2024
llama2.c Public
Forked from karpathy/llama2.c
Inference Llama 2 in one file of pure C
C MIT License Updated Feb 25, 2024
cuda_hgemm Public
Forked from Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Cuda MIT License Updated Nov 7, 2023
OpenAgents Public
Forked from xlang-ai/OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
Python Apache License 2.0 Updated Oct 24, 2023