gipsyh (Yuheng Su) / Starred · GitHub

Starred repositories

Rust 2 Updated May 25, 2025

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 4,372 183 Updated May 24, 2025

RDMA core userspace libraries and daemons

C 1,806 745 Updated May 6, 2025

A lightweight design for computation-communication overlap.

Cuda 128 5 Updated May 6, 2025

nnScaler: Compiling DNN models for Parallel Training

Python 112 15 Updated Apr 29, 2025

MPI bindings for Rust

Rust 535 57 Updated Apr 24, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,732 920 Updated May 20, 2025

Distributed Triton for Parallel Systems

Python 761 49 Updated May 26, 2025

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

Python 275 21 Updated Nov 3, 2023

ModelChecker: A bit-level model checking tool

C++ 7 1 Updated Mar 18, 2025

Fast and memory-efficient exact attention

Python 17,505 1,696 Updated May 22, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 1,195 94 Updated May 25, 2025

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.

Python 17,817 1,587 Updated Apr 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,373 601 Updated May 20, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,696 773 Updated May 23, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,572 836 Updated Apr 29, 2025

CaDiCaL SAT Solver

C++ 447 150 Updated May 23, 2025

A Fast, Low-Overhead On-chip Network

SystemVerilog 206 37 Updated May 23, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 50,117 7,255 Updated Apr 20, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,300 2,239 Updated Feb 1, 2025

BTOR2 MLIR project

C++ 25 7 Updated Jan 17, 2024

Random Generator of Btor2 Files

C++ 10 Updated Sep 2, 2023

Bit-bLAsting solving Non-linear integer constraints.

C++ 21 1 Updated Jul 12, 2024

Typed distributed plugin registration

Rust 1,112 49 Updated Mar 3, 2025

Equivalence checking with Yosys

Python 43 7 Updated May 6, 2025

A tool to convert btor2 files to LLVM.

Python 7 1 Updated Dec 29, 2020

A superoptimizer for LLVM IR

C++ 2,233 175 Updated Aug 28, 2024

C++ 3 Updated Mar 4, 2025

SystemVerilog compiler and language services

C++ 749 158 Updated May 25, 2025