micropuma

Leon micropuma

Undergraduate student at Beijing University of Posts and Telecommunications

7 followers · 87 following

Highlights

mlir-practice Public

MLIR 1 Updated Jun 27, 2025
triton-tutorial Public

MLIR MIT License Updated Jun 26, 2025
micropuma.github.io Public

HTML 2 Updated Jun 25, 2025
Pytorch-Sample Public

Python Updated Jun 25, 2025
triton Public
Forked from triton-lang/triton

Development repository for the Triton language and compiler

MLIR MIT License Updated Jun 22, 2025
Liger-Kernel Public
Forked from linkedin/Liger-Kernel

Efficient Triton Kernels for LLM Training

Python BSD 2-Clause "Simplified" License Updated Jun 19, 2025
FlagGems-dly Public
Forked from FlagOpen/FlagGems

FlagGems is an operator library for large language models implemented in the Triton Language.

Python Apache License 2.0 Updated Jun 7, 2025
Aries Public
Forked from arc-research-lab/Aries

ARIES: An Agile MLIR-Based Compilation Flow for Reconfigurable Devices with AI Engines

C++ Updated Jun 5, 2025
iree-amd-aie-dly Public
Forked from nod-ai/iree-amd-aie

IREE plugin repository for the AMD AIE accelerator

MLIR Apache License 2.0 Updated Jun 5, 2025
AIE-E2E Public

Shell Updated Jun 5, 2025
cuda-course Public

Cuda Updated Jun 2, 2025
taskflow-dly Public
Forked from taskflow/taskflow

A General-purpose Task-parallel Programming System using Modern C++

C++ Other Updated Jun 2, 2025
Stream-HLS-dly Public
Forked from UCLA-VAST/Stream-HLS

An MLIR Compiler for PyTorch/C/C++ Codes into HLS Dataflow Designs

MLIR MIT License Updated May 20, 2025
iree-dly Public

C++ Apache License 2.0 Updated May 20, 2025
onnxruntime Public
Forked from microsoft/onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ MIT License Updated May 10, 2025
fastllm Public
Forked from ztxz16/fastllm

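fastllm is a high-performance large-model inference library implemented in C++ with no backend dependencies (it depends only on CUDA and does not require PyTorch). It can run the DeepSeek R1 671B INT4 model on a single 4090, with a single request stream reaching 20+ tps.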

C++ 1 Apache License 2.0 Updated May 9, 2025
byteir Public
Forked from bytedance/byteir

A model compilation solution for various hardware

MLIR Apache License 2.0 Updated May 8, 2025
toy-project Public

C++ 1 Updated Apr 25, 2025
micro-polyaie Public
Forked from hanchenye/polyaie

An MLIR-based compiler from C/C++ to AMD-Xilinx Versal AIE

C++ Other Updated Apr 20, 2025
tvm-dly Public
Forked from apache/tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python Apache License 2.0 Updated Apr 15, 2025
tpu-mlir Public
Forked from sophgo/tpu-mlir

Machine learning compiler based on MLIR for Sophgo TPU.

C++ Other Updated Apr 12, 2025
BladeDISC Public
Forked from alibaba/BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ Apache License 2.0 Updated Apr 11, 2025
allo Public
Forked from cornell-zhang/allo

Allo: A Programming Model for Composable Accelerator Design

Python Apache License 2.0 Updated Apr 2, 2025
buddy-mlir Public
Forked from buddy-compiler/buddy-mlir

An MLIR-based compiler framework bridges DSLs (domain-specific languages) to DSAs (domain-specific architectures).

C++ Apache License 2.0 Updated Apr 2, 2025
mlir-air Public
Forked from Xilinx/mlir-air

MLIR MIT License Updated Mar 6, 2025
Module-0 Public template
Forked from minitorch/Module-0

Module 0 - Fundamentals

Python Updated Mar 2, 2025
llmsys_s24_hw1 Public
Forked from llmsystem/llmsys_s24_hw1

Python MIT License Updated Mar 2, 2025
torch-mlir Public
Forked from llvm/torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1 Other Updated Feb 13, 2025
xla Public
Forked from openxla/xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ Apache License 2.0 Updated Feb 9, 2025
Halide Public
Forked from halide/Halide

a language for fast, portable data-parallel computation

C++ Other Updated Jan 30, 2025
