triton Public
Forked from triton-lang/triton
Development repository for the Triton language and compiler
MLIR MIT License Updated Jun 22, 2025
Liger-Kernel Public
Forked from linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
Python BSD 2-Clause "Simplified" License Updated Jun 19, 2025
FlagGems-dly Public
Forked from FlagOpen/FlagGems
FlagGems is an operator library for large language models implemented in the Triton Language.
Python Apache License 2.0 Updated Jun 7, 2025
Aries Public
Forked from arc-research-lab/Aries
ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines
C++ Updated Jun 5, 2025
iree-amd-aie-dly Public
Forked from nod-ai/iree-amd-aie
IREE plugin repository for the AMD AIE accelerator
MLIR Apache License 2.0 Updated Jun 5, 2025
taskflow-dly Public
Forked from taskflow/taskflow
A General-purpose Task-parallel Programming System using Modern C++
C++ Other Updated Jun 2, 2025
Stream-HLS-dly Public
Forked from UCLA-VAST/Stream-HLS
An MLIR Compiler for PyTorch/C/C++ Codes into HLS Dataflow Designs
MLIR MIT License Updated May 20, 2025
onnxruntime Public
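Forked from microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
C++ MIT License Updated May 10, 2025
fastllm Public
Forked from ztxz16/fastllm
fastllm is a high-performance large-model inference library implemented in C++ with no backend dependencies (it depends only on CUDA and does not require PyTorch). It can run the DeepSeek R1 671B INT4 model on a single 4090, reaching 20+ tps for a single stream.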
byteir Public
Forked from bytedance/byteir
A model compilation solution for various hardware
MLIR Apache License 2.0 Updated May 8, 2025
micro-polyaie Public
Forked from hanchenye/polyaie
An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE
C++ Other Updated Apr 20, 2025
tvm-dly Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
Python Apache License 2.0 Updated Apr 15, 2025
tpu-mlir Public
Forked from sophgo/tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
C++ Other Updated Apr 12, 2025
BladeDISC Public
Forked from alibaba/BladeDISC
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
C++ Apache License 2.0 Updated Apr 11, 2025
allo Public
Forked from cornell-zhang/allo
Allo: A Programming Model for Composable Accelerator Design
Python Apache License 2.0 Updated Apr 2, 2025
buddy-mlir Public
Forked from buddy-compiler/buddy-mlir
An MLIR-based compiler framework that bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).
C++ Apache License 2.0 Updated Apr 2, 2025
Module-0 Public template
Forked from minitorch/Module-0
Module 0 - Fundamentals
Python Updated Mar 2, 2025
torch-mlir Public
Forked from llvm/torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
xla Public
Forked from openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
C++ Apache License 2.0 Updated Feb 9, 2025
Halide Public
Forked from halide/Halide
a language for fast, portable data-parallel computation
C++ Other Updated Jan 30, 2025