tp-nan (Nan) / Starred · GitHub
  • NetEase
  • HangZhou
  • 16:22 (UTC -12:00)

Organizations

@mfem @torchpipe

Starred repositories

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 49,053 7,032 Updated Apr 20, 2025

Curated collection of papers in machine learning systems

333 17 Updated Apr 3, 2025

Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.

Python 41 3 Updated May 14, 2025

C++ functions matching the interface and behavior of python string methods with std::string

C++ 1,006 162 Updated May 14, 2025

Writing AI Conference Papers: A Handbook for Beginners

2,349 76 Updated May 8, 2025

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 5,530 169 Updated May 13, 2025

A beautiful stack trace pretty printer for C++

C++ 4,013 510 Updated Apr 14, 2025

Java 5,977 1,927 Updated May 13, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,762 294 Updated Mar 10, 2025

String Formatting Library for C++17

C++ 1 Updated Feb 27, 2021

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 14,007 986 Updated May 15, 2025

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 38,645 3,029 Updated May 15, 2025

MLeap: Deploy ML Pipelines to Production

Scala 1,515 314 Updated Nov 27, 2024

Fruit, a dependency injection framework for C++

C++ 1,838 202 Updated May 11, 2025

Implementation of std::experimental::any, including small object optimization, for C++11 compilers

C++ 148 36 Updated May 1, 2024

Header-only C++ binding for libzmq

C++ 2,100 779 Updated Apr 23, 2025

Serialization library written in C++17 - Pack C++ structs into a compact byte-array without any macros or boilerplate code

C++ 509 43 Updated Sep 30, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,547 661 Updated Feb 10, 2025

cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples on how to use it

C++ 557 117 Updated Mar 20, 2025

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 294 36 Updated May 14, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 14,343 1,758 Updated May 15, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 807 36 Updated May 10, 2025

Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜

Jupyter Notebook 1,443 113 Updated May 2, 2025

Universal cross-platform tokenizers binding to HF and sentencepiece

C++ 330 77 Updated May 3, 2025

Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GPRS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents…

Python 133 10 Updated May 14, 2025

Code and information for face image quality assessment with SER-FIQ

Python 554 91 Updated Dec 9, 2022

A high-performance inference system for large language models, designed for production environments.

C++ 438 35 Updated May 15, 2025