Stars
Pipelined FPGA design to accelerate encoder-only transformer inference, implemented on a Xilinx board.
andrewn6 / fromthetransistor
Forked from geohot/fromthetransistor
From the Transistor to the Web Browser, a rough outline for a 12 week course
📚 Learn to write an embedded OS in Rust 🦀
A driving dataset for the development and validation of fused pose estimators and mapping algorithms
Efficient Triton Kernels for LLM Training
LSi - Autonomous RL Agent for Microgrid Management using Proximal Policy Optimization - https://autogrid-dashboard.vercel.app/dashboard
FlashMLA: Efficient MLA decoding kernels
Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation
A 16-bit RISC CPU with 32 instructions built with Digital for running on an FPGA.
Transformer Architecture written with CUDA, C++ and LibTorch.
Here's all my Python/Numba (CUDA) code for the encoder block I made :)
An open source GPU based off of the AMD Southern Islands ISA.
Open-source GPU in Verilog, loosely based on the RISC-V ISA
Rendering rudimentary 3D meshes on a DE1-SoC FPGA with a VGA display, using Verilog.
geohot / F32Ghidra
Forked from uuuvn/F32Ghidra
Ghidra Plugin for AMD's F32 Processor
Collection of leaked system prompts
C-based/Cached/Core Computer Vision Library, A Modern Computer Vision Library
Marvin: A Minimalist GPU-only N-Dimensional ConvNets Framework
Fast and memory-efficient exact attention
Development repository for the Triton language and compiler
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.