Stars
A Native Multimodal LLM for 3D Generation and Understanding
Godot addons from Cozy Cube Games
ComfyUI-Bagel is now available in ComfyUI. BAGEL is an open‑source multimodal foundation model with 7B active parameters (14B total), trained on large‑scale interleaved multimodal data.
A ComfyUI extension for BAGEL (Unified Model for Multimodal Understanding and Generation)
Official project page of MTVCrafter, a novel framework for general and high-quality human image animation using raw 3D motion sequences.
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
[SIGGRAPH 2025] Official code of the paper "FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios"
In-context subject-driven image generation while preserving foreground fidelity
DreamO: A Unified Framework for Image Customization
ACE-Step: A Step Towards Music Generation Foundation Model
🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Convert Mixamo animations directly to OpenPose image sequences
colinurbs / FramePack-Studio
Forked from lllyasviel/FramePack
Expanding FramePack into a multifunction video creation tool
Transparent Image Layer Diffusion using Latent Transparency
HoloPart: Generative 3D Part Amodal Segmentation
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
FastVideo is a unified framework for accelerated video generation.
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer