NetEase, HangZhou

Starred repositories
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Curated collection of papers in machine learning systems
Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.
C++ functions matching the interface and behavior of Python string methods with std::string
Writing AI Conference Papers: A Handbook for Beginners
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A beautiful stack trace pretty printer for C++
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
Andres6936 / Flossy (forked from ongbe/flossy): String Formatting Library for C++17
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
Implementation of std::experimental::any, including small object optimization, for C++11 compilers
Serialization library written in C++17 - Pack C++ structs into a compact byte-array without any macros or boilerplate code
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
SGLang is a fast serving framework for large language models and vision language models.
A throughput-oriented high-performance serving framework for LLMs
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
Universal cross-platform tokenizers binding to HF and sentencepiece
A higher-performance OpenAI LLM service than vLLM serve: a pure C++ implementation built with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function calls, AI agents…
Code and information for face image quality assessment with SER-FIQ
A high-performance inference system for large language models, designed for production environments.