Stars
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
Production-ready platform for agentic workflow development.
Video generation from text&image, 1st-gen
SubPlayer is no longer maintained, please consider Aimu
AI powered speech denoising and enhancement
Production First and Production Ready End-to-End Speech Recognition Toolkit
📖 A curated list of resources dedicated to talking face.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction…
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
arch1t3cht / Aegisub
Forked from TypesettingTools/Aegisub. Cross-platform advanced subtitle editor, with new feature branches. Read the README on the feature branch.
An XMPP library for Unity 3D targeting C# .NET 2.0 is used to implement communication with an XMPP (Extensible Messaging and Presence Protocol) server. This is useful for chat systems, real-time me…
1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)
CVPR2023 talking face implementation for Identity-Preserving Talking Face Generation With Landmark and Appearance Priors
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
This project is a real-time application in OpenCV using the first-order model
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation