A PyTorch 1.4 implementation of the HiFi-GAN vocoder, adapted from the official GitHub repository, with a slight modification to the mel-spectrogram preprocessing.
In the original repo, the input mel-spectrogram of the target audio is extracted on the fly with `torch.stft`, while in this repo it is loaded from the stored result of the data preprocessing step.
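For illustration, a minimal sketch of what that modification amounts to, assuming the preprocessing step saved one `.npy` mel-spectrogram per utterance; the file layout and helper below are hypothetical, not this repo's exact code:

```python
import numpy as np
import torch

def load_mel(mel_dir, utt_id):
    """Load a precomputed mel-spectrogram saved by the preprocessing step.

    This replaces the original on-the-fly extraction, which computed the
    mel from the waveform with torch.stft inside the dataset class.
    """
    mel = np.load(f"{mel_dir}/{utt_id}.npy")  # assumed shape: (n_mels, frames)
    return torch.from_numpy(mel).float()
```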
- Prepare your dataset.
  - Extract mel-spectrograms from the training audio data (see the preprocessing sketch after this list);
  - Divide the dataset into a training set & a validation set;
  - (Optional) Implement your own `get_dataset_filelist` method in `meldataset.py` to handle metadata.
- Specify the following parameters and run `train.py` to start training (an example invocation follows this list):
  - `input_training_file`: metadata file of the training set;
  - `input_validation_file`: metadata file of the validation set;
  - `input_mel_dir`: directory where the input mel-spectrograms are stored;
  - `input_wavs_dir`: directory where the target audio data are stored (in `.npy`);
  - `checkpoint_path`: directory to save the model & training logs;
  - `config`: path of the configuration file (in JSON, e.g. `config.json`);
  - other optional parameters.
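A minimal preprocessing sketch for the dataset steps above, assuming mels are extracted with librosa and saved as `.npy`; the parameter values, split ratio, and return format are illustrative, not this repo's exact pipeline, so match the values in `config.json`:

```python
import glob
import os
import random

import librosa
import numpy as np

def preprocess(wavs_dir, mel_dir, sr=22050, n_fft=1024, hop_length=256, n_mels=80):
    """Extract one log-mel-spectrogram per wav file and save it as .npy."""
    os.makedirs(mel_dir, exist_ok=True)
    wav_paths = sorted(glob.glob(os.path.join(wavs_dir, "*.wav")))
    for path in wav_paths:
        wav, _ = librosa.load(path, sr=sr)
        mel = librosa.feature.melspectrogram(
            y=wav, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
        log_mel = np.log(np.clip(mel, 1e-5, None))  # dynamic-range compression
        utt_id = os.path.splitext(os.path.basename(path))[0]
        np.save(os.path.join(mel_dir, utt_id + ".npy"), log_mel)

    # Simple random train/validation split over the utterance ids.
    utt_ids = [os.path.splitext(os.path.basename(p))[0] for p in wav_paths]
    random.shuffle(utt_ids)
    n_val = max(1, len(utt_ids) // 100)
    return utt_ids[n_val:], utt_ids[:n_val]  # training ids, validation ids
```

And a hypothetical `train.py` invocation; the `--flag` form assumes the parameters above are exposed as command-line arguments, so check `train.py` for the exact names and defaults:

```
python train.py \
    --input_training_file training.txt \
    --input_validation_file validation.txt \
    --input_mel_dir mels/ \
    --input_wavs_dir wavs/ \
    --checkpoint_path cp_hifigan \
    --config config.json
```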
- Download a pretrained HiFi-GAN generator model and the corresponding JSON configuration file into the same directory.
- Follow the example in `inference.py`: construct a `Vocoder` instance by specifying the path of the generator checkpoint file to be loaded.
- A waveform can then be generated from an input mel-spectrogram with the `Vocoder.mel2wav` interface (a usage sketch follows this list).
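A minimal inference sketch, assuming `Vocoder` takes the checkpoint path in its constructor and `mel2wav` maps a mel array to a waveform; see `inference.py` for the exact signatures, which may differ:

```python
import numpy as np
import soundfile as sf

from inference import Vocoder  # module path assumed

# Point the Vocoder at a downloaded generator checkpoint; the JSON config
# is expected to sit in the same directory (checkpoint name is hypothetical).
vocoder = Vocoder("cp_hifigan/g_02500000")

mel = np.load("mels/utt_0001.npy")  # precomputed mel-spectrogram from preprocessing
wav = vocoder.mel2wav(mel)          # synthesize the waveform from the mel

# Assuming mel2wav returns a 1-D array and a 22050 Hz config was used.
sf.write("utt_0001_generated.wav", wav, 22050)
```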