Stars
Professional CUDA C Programming
A nearly complete collection of prefix sum algorithms implemented in CUDA, D3D12, Unity and WGPU. Theoretically portable to all wave/warp/subgroup sizes.
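Samples for CUDA developers that demonstrate features in the CUDA Toolkit
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks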
Enabling PyTorch on XLA Devices (e.g. Google TPU)
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Homework of CMU 10-414/714: Deep Learning Systems (https://dlsyscourse.org/)
My learning notes and code for ML systems (ML SYS).
My solutions to the assignments of CMU 10-714 Deep Learning Systems 2022
Deep learning framework from CMU 10-414/714: Deep Learning Systems
A machine learning compiler for GPUs, CPUs, and ML accelerators
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat Mini Program) / If you find it useful, please star this project, thanks~
An Open Source Machine Learning Framework for Everyone
Assignment 1: automatic differentiation
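A simple deep learning framework in pure Python, built for learning DL
2023 personal v2ray proxy setup / bypassing internet restrictions / latest VPN setup tutorial / stable / self-hosted proxy / Google access / encrypted proxy / v2ray one-click script
An open source ebook about the TensorFlow kernel and its implementation mechanisms.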
Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍
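Electronic edition and accompanying code for the book 《深度学习入门:基于Python的理论与实现》 (Introduction to Deep Learning: Theory and Implementation Based on Python).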
Deep Learning Book Chinese Translation