Stars
Official inference repo for FLUX.1 models
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
VideoSys: An easy and efficient system for video generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
Pocket Flow: 100-line LLM framework. Let Agents build Agents!
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
A curated list for Efficient Large Language Models
CUDA Python: Performance meets Productivity
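《Pytorch实用教程》 (PyTorch Practical Tutorial, 2nd edition): whether you are starting from zero, applying PyTorch to CV, NLP, or LLM projects, or moving on to engineering deployment, it is all covered here. With this book's help, readers should be able to master PyTorch with ease and become excellent deep learning engineers.
How to learn PyTorch and OneFlow
How to optimize some algorithms in CUDA.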
An extremely fast Python package and project manager, written in Rust.
📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepEP: an efficient expert-parallel communication library