Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

Ant Group


(Demo video: full_body_en.mp4)

✨ For more results, visit our Project Page

📌 Updates

  • [2025.01.21] 🔥 We updated the Colab demo; welcome to try it.
  • [2025.01.10] 🔥 We released our inference code and models.
  • [2024.11.29] 🔥 Our paper is available on arXiv.

🛠️ Installation

Tested Environment

  • System: Ubuntu 20.04 (WSL2)
  • GPU: NVIDIA RTX 3070
  • Python: 3.10
  • TensorRT: 8.6.1
```shell
git clone https://github.com/satoooh/ditto-talkinghead
cd ditto-talkinghead

# install dependencies
uv sync

# download checkpoints
git lfs install
git clone https://huggingface.co/digital-avatar/ditto-talkinghead checkpoints
```
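If `git lfs install` is skipped before cloning the checkpoints, the large weight files come down as small Git LFS pointer stubs rather than the real data, and inference will fail with confusing errors. A minimal sketch to detect that situation (the helper names are ours, not part of this repo):

```python
from pathlib import Path

# Git LFS pointer files are tiny text stubs that start with this header.
LFS_HEADER = b"version https://git-lfs.github.com/spec"

def is_lfs_pointer(path: Path) -> bool:
    # True if the file is an unresolved LFS pointer, not real weights.
    try:
        with path.open("rb") as f:
            return f.read(len(LFS_HEADER)) == LFS_HEADER
    except OSError:
        return False

def unresolved_pointers(checkpoints: str = "./checkpoints") -> list:
    # Any file still in pointer form means the weights were not fetched;
    # fix with `git lfs install` followed by `git lfs pull` in the clone.
    return [p for p in Path(checkpoints).rglob("*")
            if p.is_file() and is_lfs_pointer(p)]
```

If this returns a non-empty list, re-run `git lfs install` and then `git lfs pull` inside the `checkpoints` clone.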

```shell
# run inference.py
uv run python inference.py --data_root "./checkpoints/ditto_trt_Ampere_Plus" --cfg_pkl "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" --audio_path "./example/audio.wav" --source_path "./example/image.png" --output_path "./tmp/result.mp4"
```
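To synthesize several clips without retyping the long command, the CLI call above can be wrapped in a small helper. This is a sketch built only from the flags shown above (the function names are ours); it constructs the command line, and actually running it still requires the downloaded checkpoints and a supported GPU:

```python
import subprocess
from pathlib import Path

def build_inference_cmd(audio_path, source_path, output_path,
                        data_root="./checkpoints/ditto_trt_Ampere_Plus",
                        cfg_pkl="./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl"):
    # Mirror the README's inference.py invocation for one audio/image pair.
    return [
        "uv", "run", "python", "inference.py",
        "--data_root", data_root,
        "--cfg_pkl", cfg_pkl,
        "--audio_path", str(audio_path),
        "--source_path", str(source_path),
        "--output_path", str(output_path),
    ]

def run_batch(audio_dir, source_path, out_dir):
    # One output video per .wav file, named after the audio clip.
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        cmd = build_inference_cmd(wav, source_path, out / f"{wav.stem}.mp4")
        subprocess.run(cmd, check=True)
```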

❗Note:

We provide TensorRT models built with hardware-compatibility-level=Ampere_Plus (checkpoints/ditto_trt_Ampere_Plus/). If your GPU does not support this level, run the script/cvt_onnx_to_trt.py script to convert the general ONNX models (checkpoints/ditto_onnx/) to TensorRT models.

```shell
uv run python script/cvt_onnx_to_trt.py --onnx_dir "./checkpoints/ditto_onnx" --trt_dir "./checkpoints/ditto_trt_custom"
```

Then run inference.py with --data_root=./checkpoints/ditto_trt_custom.
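To check whether the prebuilt engines apply to your card: TensorRT's Ampere_Plus hardware-compatibility level targets GPUs of CUDA compute capability 8.0 (Ampere) and newer. A small sketch (the helper name is ours, not part of this repo):

```python
def supports_ampere_plus(major: int, minor: int) -> bool:
    # Ampere_Plus engines run on compute capability 8.0 and newer,
    # e.g. the RTX 3070 (8.6) from the tested environment above.
    return (major, minor) >= (8, 0)

# Example: query the capability with PyTorch, if installed:
#   import torch
#   major, minor = torch.cuda.get_device_capability(0)
#   use_prebuilt = supports_ampere_plus(major, minor)
```

A Turing card such as an RTX 2080 (compute capability 7.5) would return False, which means you need the ONNX-to-TensorRT conversion step above.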
