fanyangCS (fan yang) / Starred · GitHub

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

Cuda 84 6 Updated May 9, 2025

Free, simple, fast interactive diagrams for any GitHub repository

TypeScript 11,233 771 Updated Apr 24, 2025

Technical report of Kimina-Prover Preview.

277 8 Updated May 10, 2025
C++ 11 2 Updated Feb 28, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 1,121 82 Updated May 10, 2025

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 608 47 Updated May 5, 2025

We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstra…

C++ 181 11 Updated Jan 28, 2025

FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of lists of statically-shaped tensors, referred to as a Fractal…

Python 26 4 Updated Dec 21, 2024

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 12,939 11,795 Updated Apr 22, 2025

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Python 155 8 Updated Sep 21, 2024

nnScaler: Compiling DNN models for Parallel Training

Python 110 14 Updated Apr 29, 2025

Low-bit LLM inference on CPU with lookup table

C++ 770 59 Updated Apr 22, 2025

MSVBASE is a system that efficiently supports complex queries of both approximate similarity search and relational operators. It integrates high-dimensional vector indices into PostgreSQL, a relati…

C++ 93 12 Updated Nov 19, 2024
Python 143 13 Updated Jul 22, 2024
Python 74 1 Updated Feb 22, 2023

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

Python 14,186 1,819 Updated Jul 3, 2024

A unified 3D Transformer Pipeline for visual synthesis

2,811 165 Updated May 29, 2023

Tutel MoE: Optimized Mixture-of-Experts Library, Support DeepSeek FP8/FP4

C 819 96 Updated May 10, 2025

A validation and profiling tool for AI infrastructure

Python 309 65 Updated May 1, 2025

System for AI Education Resource.

Python 3,985 507 Updated Oct 25, 2024

An experimental parallel training platform

54 15 Updated Mar 25, 2024

Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.

C++ 469 47 Updated Apr 20, 2025

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

C++ 982 165 Updated Sep 19, 2024

A decoupled transaction component providing transaction processing for applications

C++ 6 2 Updated Jul 15, 2020

Resource scheduling and cluster management for AI

JavaScript 2,663 548 Updated Jun 6, 2024

OpenPAI SDK

TypeScript 19 16 Updated Dec 10, 2022

Extension to connect OpenPAI clusters, submit AI jobs, simulate jobs locally, manage files, and so on.

TypeScript 14 5 Updated Dec 10, 2022

A marketplace that stores examples and job templates for OpenPAI. Users can use openpaimarketplace to share their own jobs, or to run and learn from jobs shared by others.

JavaScript 33 21 Updated Dec 13, 2022

Runtime for deep learning workload

Python 20 16 Updated May 24, 2022