Stars
SkyReels-V2: Infinite-length Film Generative model
This SDK is now deprecated; use the unified Firebase SDK.
On-device AI across mobile, embedded and edge for PyTorch
project-deepform / deepform
Forked from jstray/deepform
Experimental form data extraction for journalism
The official Python library for the OpenAI API
OCR Annotations from Amazon Textract for Industry Documents Library
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
An implementation of "CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model".
The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.
Implementation of Nougat: Neural Optical Understanding for Academic Documents
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Mora: More like Sora for Generalist Video Generation
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
Official codes of CCSRv2 and CCSRv1: Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS@2023 Spotlight, TPAMI@2024)
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC ac…
System design patterns for machine learning
A high-throughput and memory-efficient inference and serving engine for LLMs
Large Language Model Text Generation Inference
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Robust recipes to align language models with human and AI preferences
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)