
STSA: Spatial-Temporal Semantic Alignment for Facial Visual Dubbing

PyTorch implementation of our ICME 2025 submission "STSA: Spatial-Temporal Semantic Alignment for Facial Visual Dubbing".

[YouTube demo link] [Figure: framework overview]

Todo:

  • inference code
  • paper & supplementary material
  • YouTube demo
  • training code
  • fine-tuning code

Demo:

Multilingual Generation

chinese.mp4
korean.mp4
japanese.mp4
spanish.mp4

Long Video Generation Compared with SOTA Methods

We compare our method with DiffTalk (CVPR'23), DINet (AAAI'23), IP-LAP (CVPR'23), MuseTalk (arXiv 2024), PC-AVS (CVPR'21), TalkLip (CVPR'23), and Wav2Lip (MM'20).

Ours.mp4
DiffTalk.mp4
DINet.mp4
IP-LAP.mp4
MuseTalk.mp4
PC-AVS.mp4
TalkLIp.mp4
Wav2Lip.mp4

Inference:

Requirements

  • Python 3.8.7
  • torch 1.12.1
  • torchvision 0.13.1
  • librosa 0.9.2
  • ffmpeg
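
A quick way to see whether an existing environment matches the list above is to compare the installed package versions against it. The following is a minimal sketch, not part of the repository: the helper name `check_environment` is hypothetical, and ignoring local build tags such as `+cu113` is an assumption about how the versions should be compared.

```python
import shutil
from importlib import metadata

# Versions copied from the requirements list above.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "librosa": "0.9.2",
}


def check_environment(expected=EXPECTED):
    """Return a dict mapping each requirement to a short status string."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Assumption: local build tags such as "+cu113" do not matter here.
        base = found.split("+")[0]
        report[name] = "ok" if base == wanted else f"found {found}, expected {wanted}"
    # ffmpeg is a command-line tool, so look for it on PATH instead.
    report["ffmpeg"] = "ok" if shutil.which("ffmpeg") else "missing"
    return report


for name, status in check_environment().items():
    print(f"{name}: {status}")
```

Any entry that is not reported as "ok" points to a package that should be installed or pinned before running inference.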

Prepare Environment

First, create a conda environment:

conda create -n stsa python=3.8
conda activate stsa

PyTorch 1.12.1 is used; the other requirements are listed in "requirements.txt". Install them with:

pip install -r requirements.txt

Quick Start

Download the pretrained weights and put them under ./checkpoints. Then run the following command:

python inference.py --video_path "demo_templates/video/speakerine.mp4" --audio_path "demo_templates/audio/education.wav"
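
Before launching the command above, it can help to confirm that the input files and the checkpoint folder are in place. This is a minimal sketch under stated assumptions: `preflight` is a hypothetical helper that is not part of the repository, and it only checks that ./checkpoints exists and is non-empty, since the names of the weight files are not given here.

```python
from pathlib import Path


def preflight(video_path, audio_path, checkpoint_dir="./checkpoints"):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for label, path in (("video", video_path), ("audio", audio_path)):
        if not Path(path).is_file():
            problems.append(f"{label} file not found: {path}")
    ckpt = Path(checkpoint_dir)
    if not ckpt.is_dir():
        problems.append(f"checkpoint directory not found: {checkpoint_dir}")
    elif not any(ckpt.iterdir()):
        # Assumption: any file in the folder counts as downloaded weights.
        problems.append(f"checkpoint directory is empty: {checkpoint_dir}")
    return problems


issues = preflight(
    "demo_templates/video/speakerine.mp4",
    "demo_templates/audio/education.wav",
)
print("ready" if not issues else "\n".join(issues))
```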

You can set the --video_path and --audio_path options to run inference on other videos.
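
Dubbing several video/audio pairs in a row could be scripted along the following lines. This is a sketch rather than a repository feature: `build_inference_command` and `run_pairs` are hypothetical helpers, the second pair of file names is made up for illustration, and only the --video_path and --audio_path options shown above are used.

```python
import subprocess
import sys


def build_inference_command(video_path, audio_path, script="inference.py"):
    """Build the argument list for one inference run."""
    return [
        sys.executable, script,
        "--video_path", str(video_path),
        "--audio_path", str(audio_path),
    ]


def run_pairs(pairs, dry_run=True):
    """Run (or, by default, only print) one inference command per pair."""
    commands = [build_inference_command(v, a) for v, a in pairs]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if inference.py fails.
            subprocess.run(cmd, check=True)
    return commands


# The first pair is the demo from Quick Start; the second is hypothetical.
pairs = [
    ("demo_templates/video/speakerine.mp4", "demo_templates/audio/education.wav"),
    ("my_videos/clip.mp4", "my_audio/speech.wav"),
]
run_pairs(pairs)  # dry run: prints the commands without executing them
```

Passing `dry_run=False` executes the commands one after another from the repository root; the dry run is the default so that the generated commands can be inspected first.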
