Official implementation of the paper "GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis" (ACM MM 2024). [Paper]
Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li.
Audio samples are available on our Website.
- Install Anaconda and Python (we use version 3.8.10).
- Create a new environment for Groot and install the requirements.
```bash
conda create -n Groot python=3.8
conda activate Groot
pip install -r requirements.txt
```
Download the pretrained models and place them into 📁pretrain/.
The pretrained models can be downloaded from GoogleDrive.
As described in the paper, we use DiffWave as the diffusion model.
The pretrained DiffWave model can be downloaded from GoogleDrive; please also place it into 📁pretrain/.
We also provide links for WaveGrad and PriorGrad.
```
${Groot}
|-- diffwave
|-- pretrain            <-- the downloaded pretrained models
|-- inference.py
|-- model.py
|-- other python codes, config, LICENSE and README files
```
The pretrained models correspond to the LJSpeech dataset. Here, we provide the link to download LJSpeech.
The LibriTTS and LibriSpeech datasets can be downloaded via torchaudio, as shown in the sketch below.
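A minimal sketch using torchaudio's built-in dataset downloaders; the root directory `data/` and the subset names are arbitrary choices, not values prescribed by this repo.

```python
import torchaudio

# Download LibriTTS (train-clean-100 subset) and LibriSpeech (test-clean subset)
# into data/; torchaudio fetches and extracts the archives automatically.
libritts = torchaudio.datasets.LIBRITTS("data/", url="train-clean-100", download=True)
librispeech = torchaudio.datasets.LIBRISPEECH("data/", url="test-clean", download=True)

# Each item is a tuple whose first two elements are the waveform tensor
# and its sample rate.
waveform, sample_rate, *_ = libritts[0]
print(waveform.shape, sample_rate)
```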
You can use the pretrained models to assess Groot's performance at a capacity of 100 bps on the LJSpeech dataset.
```bash
python inference.py --dataset_path path_to_your_test_dataset \
                    --encoder path_to_encoder \
                    --decoder path_to_decoder \
                    --diffwave path_to_generative_model
```
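For example, assuming the checkpoints were placed in 📁pretrain/ and given the hypothetical file names below (the actual names come from the Google Drive download), an invocation might look like:

```bash
# Paths and checkpoint file names here are illustrative placeholders.
python inference.py --dataset_path data/LJSpeech-1.1/wavs \
                    --encoder pretrain/encoder.pt \
                    --decoder pretrain/decoder.pt \
                    --diffwave pretrain/diffwave.pt
```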
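For reference, a minimal sketch of the bit-accuracy metric commonly used to evaluate watermark recovery at a given capacity; `embedded` and `recovered` are hypothetical 0/1 arrays, not outputs of this repo's API.

```python
import numpy as np

def bit_accuracy(embedded: np.ndarray, recovered: np.ndarray) -> float:
    """Fraction of watermark bits recovered correctly."""
    assert embedded.shape == recovered.shape
    return float(np.mean(embedded == recovered))

# Example: a 100-bit watermark (matching the 100 bps setting above).
rng = np.random.default_rng(0)
embedded = rng.integers(0, 2, size=100)
recovered = embedded.copy()
recovered[:3] ^= 1  # flip 3 bits to simulate decoding errors
print(f"Bit accuracy: {bit_accuracy(embedded, recovered):.2%}")  # 97.00%
```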
[1] DiffWave: 📰[paper] 💻[code]
[2] WaveGrad: 📰[paper] 💻[code]
[3] PriorGrad: 📰[paper] 💻[code]
This project is released under the MIT license. See LICENSE for details.
If you find the code and dataset useful in your research, please consider citing our paper:
```bibtex
@inproceedings{liu2024groot,
  title={GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis},
  author={Liu, Weizhi and Li, Yue and Lin, Dongdong and Tian, Hui and Li, Haizhou},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  year={2024}
}
```