Stars
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
A powerful tool for creating fine-tuning datasets for LLMs
SD-Trainer: LoRA & Dreambooth training scripts and GUI for diffusion models, built on kohya-ss's trainer.
MAGI-1: Autoregressive Video Generation at Scale
Enjoy the magic of Diffusion models!
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Video, image, and GIF upscaling/enlarging (super-resolution) and video frame interpolation. Achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IF…
A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.
📹 A more flexible framework that can generate videos at any resolution and creates videos from images.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Wan: Open and Advanced Large-Scale Video Generative Models
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers
A novel approach to Hunyuan image-to-video sampling
The official implementation of "RepVideo: Rethinking Cross-Layer Representation for Video Generation"
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
"A Guide to Building White-Box Large Models": a fully hand-built Tiny-Universe
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[ICCV 2023, Official Code] for paper "Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives". Official Weights and Demos provided.
FastVideo is a unified framework for accelerated video generation.
A pipeline parallel training script for diffusion models.