Starred repositories
Out of time: automated lip sync in the wild
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
📖 A curated list of resources dedicated to talking face.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko,…
This repository contains the source code for the paper "First Order Motion Model for Image Animation"
Multilingual Voice Understanding Model
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Building and training a speech emotion recognizer that predicts human emotions using Python, scikit-learn, and Keras
Speech emotion recognition implemented in Keras (LSTM, CNN, SVM, MLP)
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
A neural network model capable of detecting five different emotions in male and female speech audio. (Deep Learning, NLP, Python)
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
PyTorch implementation of the paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"
The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A latent text-to-image diffusion model
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.