Georgia Institute of Technology
Atlanta, GA
https://liushuojiang.github.io/
in/shuojiang-liu
Lists (25)
🪙 Algorithms
🤖 Chatbot
📲 ChatGPT
🖥️ Computer Systems
🪄 CPP
⏳ CUDA Related
🛢️ Database
📖 General Resources
🥊 Go
🎨 Graphics
🕸️ Graphs
🎢 HPC
🏢 Interviews
🏷️ Leetcode
💎 LLM
⚗️ ML
🛠️ ML Tools
🗂️ Others
🐍 Python Fundamentals
🎮 RL
🧮 Scientific Computing
🔐 Security
🔨 Tools
🍉 Watermelon
🛜 Web Dev
Starred repositories
Solutions for Object Oriented Design Problems
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, and Assembly, from numerics and SIMD to coroutines, ranges, exception handling, networking, and user-space IO
FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser & Trae AI (And other Open Sourced) System Prompts, Tools & AI Models.
zhouwg / ggml-hexagon
Forked from ggml-org/llama.cpp
An attempt to build a fully open-source ggml-hexagon backend for llama.cpp on Android phones equipped with Qualcomm's Hexagon NPU; details can be seen at https://github.com/zhouwg/ggml-hexagon/discussions/18
chraac / llama.cpp
Forked from ggml-org/llama.cpp
LLM inference in C/C++
A Datacenter Scale Distributed Inference Serving Framework
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
Utilities intended for use with Llama models.
A series of GPU optimization topics introducing how to optimize CUDA kernels in detail, covering several basic kernel optimizations, including: elementwise, reduce, s…
LostRuins / koboldcpp
Forked from ggml-org/llama.cpp
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
Integrate the DeepSeek API into popular software
Real-time face swap and one-click video deepfake with only a single image
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Introduction to Machine Learning Systems
Efficient Triton Kernels for LLM Training
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
A concise but complete full-attention transformer with a set of promising experimental features from various papers
LLM notes covering model inference, transformer model structure, and LLM framework code analysis.
LLM theoretical performance analysis tools supporting parameter, FLOPs, memory, and latency analysis.