Stars
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
DreamO: A Unified Framework for Image Customization
KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Pythonic AI generation of images and videos
Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion
[Support 0.49.x](Reset Cursor AI MachineID & Bypass Higher Token Limit) Cursor AI, automatically reset the machine ID, upgrade to Pro features for free: You've reached your trial request limit. / Too many free trial accounts used on this machi…
Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Liquid: Language Models are Scalable and Unified Multi-modal Generators
PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask
ACTalker: an end-to-end video diffusion framework for talking head synthesis that supports both single and multi-signal control (e.g., audio, expression).
Awesome Instruction Editing. Image and Media Editing with Human Instructions. Instruction-Guided Image and Media Editing.
Enjoy the magic of Diffusion models!
High-quality, training-free inpainting for every Stable Diffusion model.
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
Automatically Update Text-to-speech (TTS) Papers Daily using GitHub Actions (Update Every 12 hours)
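This project implements Wav2lip video lip synthesis based on SadTalkers. It generates lip shapes driven by speech, taking a video file as input, and applies a configurable enhancement mode to the face region to enhance the synthesized lip (face) area and improve the clarity of the generated lips. It uses DAIN, a deep learning frame interpolation algorithm, to add frames to the generated video, filling in the motion transitions between frames so that the synthesized lips look smoother, more realistic, and more natural.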
[CVPR-2025] The official code of HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨
Introduces a video face restoration method.
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Official implementations for paper: VACE: All-in-One Video Creation and Editing
🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."