Ant Group
[Demo video: full_body_en.mp4]
✨ For more results, visit our Project Page ✨
- [2025.01.21] 🔥 We updated the Colab demo; you are welcome to try it.
- [2025.01.10] 🔥 We released our inference code and models.
- [2024.11.29] 🔥 Our paper is now public on arXiv.
Tested Environment
- System: Ubuntu 20.04 (WSL2)
- GPU: NVIDIA RTX 3070
- Python: 3.10
- TensorRT: 8.6.1
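Before installing, it can help to sanity-check your machine against the tested setup above. A minimal sketch (the checks and the `check_environment` helper are illustrative assumptions, not part of the repo; the version bound mirrors the tested Python 3.10):

```python
import shutil
import sys

def check_environment(min_python=(3, 10)):
    """Report whether the local setup roughly matches the tested one."""
    report = {}
    # Python: the repo was tested with 3.10.
    report["python_ok"] = sys.version_info[:2] >= min_python
    # NVIDIA driver: nvidia-smi on PATH is a cheap proxy for a usable GPU.
    report["nvidia_smi_found"] = shutil.which("nvidia-smi") is not None
    # git-lfs is needed to pull the checkpoints from Hugging Face.
    report["git_lfs_found"] = shutil.which("git-lfs") is not None
    return report

if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```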
```shell
# clone the repository
git clone https://github.com/satoooh/ditto-talkinghead
cd ditto-talkinghead

# install dependencies
uv sync

# download checkpoints
git lfs install
git clone https://huggingface.co/digital-avatar/ditto-talkinghead checkpoints

# run inference.py
uv run python inference.py \
  --data_root "./checkpoints/ditto_trt_Ampere_Plus" \
  --cfg_pkl "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" \
  --audio_path "./example/audio.wav" \
  --source_path "./example/image.png" \
  --output_path "./tmp/result.mp4"
```
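`inference.py` processes one audio/image pair per invocation, so generating several clips means calling it once per file. A sketch of a batch wrapper, assuming the flags shown in the command above are the full interface (the `build_command` and `run_batch` helpers are hypothetical, not part of the repo):

```python
import subprocess
from pathlib import Path

DATA_ROOT = "./checkpoints/ditto_trt_Ampere_Plus"
CFG_PKL = "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl"

def build_command(audio_path, source_path, output_path):
    """Assemble the same CLI call shown above for one audio/image pair."""
    return [
        "uv", "run", "python", "inference.py",
        "--data_root", DATA_ROOT,
        "--cfg_pkl", CFG_PKL,
        "--audio_path", str(audio_path),
        "--source_path", str(source_path),
        "--output_path", str(output_path),
    ]

def run_batch(audio_dir, source_image, out_dir):
    """Render one result video per .wav file, named after the audio clip."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        cmd = build_command(wav, source_image, out_dir / f"{wav.stem}.mp4")
        subprocess.run(cmd, check=True)
```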
❗Note:

We provide the TensorRT model with `hardware-compatibility-level=Ampere_Plus` (`checkpoints/ditto_trt_Ampere_Plus/`). If your GPU does not support it, execute the `cvt_onnx_to_trt.py` script to convert the general ONNX model (`checkpoints/ditto_onnx/`) to a TensorRT model:
```shell
uv run python script/cvt_onnx_to_trt.py --onnx_dir "./checkpoints/ditto_onnx" --trt_dir "./checkpoints/ditto_trt_custom"
```
Then run `inference.py` with `--data_root=./checkpoints/ditto_trt_custom`.
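TensorRT's `AMPERE_PLUS` hardware-compatibility level targets compute capability 8.0 (Ampere) and newer, so a quick way to decide whether you need the conversion step is your GPU's SM version. A sketch of that rule (the `needs_custom_engine` function is illustrative, not part of the repo):

```python
def needs_custom_engine(sm_major: int, sm_minor: int) -> bool:
    """True if the prebuilt Ampere_Plus TensorRT engines will not load.

    TensorRT's AMPERE_PLUS level covers compute capability 8.0 and above;
    older architectures (Turing 7.5, Volta 7.0, Pascal 6.x) need engines
    rebuilt from the ONNX models via cvt_onnx_to_trt.py.
    """
    return (sm_major, sm_minor) < (8, 0)

# Examples: RTX 3070 is SM 8.6 (Ampere), RTX 2080 is SM 7.5 (Turing).
print(needs_custom_engine(8, 6))  # Ampere: the prebuilt engines work
print(needs_custom_engine(7, 5))  # Turing: run cvt_onnx_to_trt.py first
```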