Ant Group
[Demo video: full_body_en.mp4]
✨ For more results, visit our Project Page ✨
- [2025.01.21] 🔥 We updated the Colab demo; you are welcome to try it.
- [2025.01.10] 🔥 We released our inference code and models.
- [2024.11.29] 🔥 Our paper is now public on arXiv.
Tested Environment
- System: Ubuntu 20.04 (WSL2)
- GPU: NVIDIA RTX 3070
- Python: 3.10
- TensorRT: 8.6.1
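Before installing, it can help to sanity-check your machine against the tested setup above. A minimal sketch (the checks and the `check_environment` helper are illustrative assumptions, not part of the repo; the version bound mirrors the tested Python 3.10):

```python
import shutil
import sys

def check_environment(min_python=(3, 10)):
    """Report whether the local setup roughly matches the tested one."""
    report = {}
    # Python: the repo was tested with 3.10.
    report["python_ok"] = sys.version_info[:2] >= min_python
    # NVIDIA driver: nvidia-smi on PATH is a cheap proxy for a usable GPU.
    report["nvidia_smi_found"] = shutil.which("nvidia-smi") is not None
    # git-lfs is needed to pull the checkpoints from Hugging Face.
    report["git_lfs_found"] = shutil.which("git-lfs") is not None
    return report

if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```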
```shell
# clone the repository
git clone https://github.com/satoooh/ditto-talkinghead
cd ditto-talkinghead

# install dependencies
uv sync

# download checkpoints
git lfs install
git clone https://huggingface.co/digital-avatar/ditto-talkinghead checkpoints

# run inference.py
uv run python inference.py \
  --data_root "./checkpoints/ditto_trt_Ampere_Plus" \
  --cfg_pkl "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" \
  --audio_path "./example/audio.wav" \
  --source_path "./example/image.png" \
  --output_path "./tmp/result.mp4"
```
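`inference.py` processes one audio/image pair per invocation, so generating several clips means calling it once per file. A sketch of a batch wrapper, assuming the flags shown in the command above are the full interface (the `build_command` and `run_batch` helpers are hypothetical, not part of the repo):

```python
import subprocess
from pathlib import Path

DATA_ROOT = "./checkpoints/ditto_trt_Ampere_Plus"
CFG_PKL = "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl"

def build_command(audio_path, source_path, output_path):
    """Assemble the same CLI call shown above for one audio/image pair."""
    return [
        "uv", "run", "python", "inference.py",
        "--data_root", DATA_ROOT,
        "--cfg_pkl", CFG_PKL,
        "--audio_path", str(audio_path),
        "--source_path", str(source_path),
        "--output_path", str(output_path),
    ]

def run_batch(audio_dir, source_image, out_dir):
    """Render one result video per .wav file, named after the audio clip."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        cmd = build_command(wav, source_image, out_dir / f"{wav.stem}.mp4")
        subprocess.run(cmd, check=True)
```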
❗Note:

We provide the TensorRT model with `hardware-compatibility-level=Ampere_Plus` (`checkpoints/ditto_trt_Ampere_Plus/`). If your GPU does not support it, execute the `cvt_onnx_to_trt.py` script to convert the general ONNX model (`checkpoints/ditto_onnx/`) to a TensorRT model:
```shell
uv run python script/cvt_onnx_to_trt.py --onnx_dir "./checkpoints/ditto_onnx" --trt_dir "./checkpoints/ditto_trt_custom"
```
Then run `inference.py` with `--data_root=./checkpoints/ditto_trt_custom`.
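TensorRT's `AMPERE_PLUS` hardware-compatibility level targets compute capability 8.0 (Ampere) and newer, so a quick way to decide whether you need the conversion step is your GPU's SM version. A sketch of that rule (the `needs_custom_engine` function is illustrative, not part of the repo):

```python
def needs_custom_engine(sm_major: int, sm_minor: int) -> bool:
    """True if the prebuilt Ampere_Plus TensorRT engines will not load.

    TensorRT's AMPERE_PLUS level covers compute capability 8.0 and above;
    older architectures (Turing 7.5, Volta 7.0, Pascal 6.x) need engines
    rebuilt from the ONNX models via cvt_onnx_to_trt.py.
    """
    return (sm_major, sm_minor) < (8, 0)

# Examples: RTX 3070 is SM 8.6 (Ampere), RTX 2080 is SM 7.5 (Turing).
print(needs_custom_engine(8, 6))  # Ampere: the prebuilt engines work
print(needs_custom_engine(7, 5))  # Turing: run cvt_onnx_to_trt.py first
```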