F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

F5-TTS: Diffusion Transformer with ConvNeXt V2, faster to train and faster at inference.

E2 TTS: Flat-UNet Transformer, the closest reproduction of the paper.

Sway Sampling: an inference-time flow step sampling strategy that greatly improves performance.

Thanks to all the contributors!

News

2025/03/12: 🔥 F5-TTS v1 base model with better training and inference performance. A few demos.
2024/10/08: F5-TTS & E2 TTS base models on 🤗 Hugging Face, 🤖 Model Scope, 🟣 Wisemodel.

Installation

Create a separate environment if needed

# Create a python 3.10 conda env (you could also use virtualenv)
conda create -n f5-tts python=3.10
conda activate f5-tts

Install PyTorch matching your device

NVIDIA GPU

# Install pytorch with your CUDA version, e.g.
pip install torch==2.4.0+cu124 torchaudio==2.4.0+cu124 --extra-index-url https://download.pytorch.org/whl/cu124

AMD GPU

# Install pytorch with your ROCm version (Linux only), e.g.
pip install torch==2.5.1+rocm6.2 torchaudio==2.5.1+rocm6.2 --extra-index-url https://download.pytorch.org/whl/rocm6.2

Intel GPU

# Install pytorch with your XPU version, e.g.
# Intel® Deep Learning Essentials or Intel® oneAPI Base Toolkit must be installed
pip install torch torchaudio --index-url https://download.pytorch.org/whl/test/xpu

# Intel GPU support is also available through IPEX (Intel® Extension for PyTorch)
# IPEX does not require the Intel® Deep Learning Essentials or Intel® oneAPI Base Toolkit
# See: https://pytorch-extension.intel.com/installation?request=platform

Apple Silicon

# Install the stable pytorch, e.g.
pip install torch torchaudio
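After installing, it can help to confirm which backend PyTorch actually sees. The following is a minimal sketch, not part of F5-TTS: `pick_device` is a hypothetical helper name, and the check order (CUDA/ROCm, then XPU, then MPS, then CPU) is an assumption chosen for illustration. ROCm builds of PyTorch report through `torch.cuda`, so AMD GPUs show up as "cuda" here.

```python
def pick_device() -> str:
    """Return the name of the first available accelerator, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment yet.
        return "cpu"

    # NVIDIA CUDA builds and AMD ROCm builds both answer through torch.cuda.
    if torch.cuda.is_available():
        return "cuda"

    # torch.xpu is missing in older PyTorch releases, so look it up defensively.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"

    # Apple Silicon GPUs are exposed through the MPS backend.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with a GPU, the installed wheel most likely does not match the driver or toolkit version.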

Then choose one of the options below:

1. As a pip package (if just for inference)

pip install f5-tts
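To confirm that the package landed in the active environment, one option is to query the installed distribution metadata with the standard library. This is a small sketch under the assumption that the distribution name is `f5-tts`, as in the pip command above; `installed_version` is a hypothetical helper, not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "f5-tts") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("f5-tts is not installed in this environment")
else:
    print(f"f5-tts {found} is installed")
```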


Model	Concurrency	Avg Latency	RTF	Mode
F5-TTS Base (Vocos)	2	253 ms	0.0394	Client-Server
F5-TTS Base (Vocos)	1 (Batch_size)	-	0.0402	Offline TRT-LLM
F5-TTS Base (Vocos)	1 (Batch_size)	-	0.1467	Offline Pytorch
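To read the RTF column, the sketch below assumes the usual definition of real-time factor (synthesis time divided by the duration of the generated audio); the table itself does not define it. Under that assumption, lower is faster, and the ratio of two RTF values gives the relative speedup. `synthesis_seconds` is a hypothetical helper for illustration.

```python
def synthesis_seconds(rtf: float, audio_seconds: float) -> float:
    """Estimated compute time for a clip, assuming RTF = synthesis time / audio duration."""
    return rtf * audio_seconds


# RTF values copied from the table above.
rtf_client_server = 0.0394
rtf_offline_trtllm = 0.0402
rtf_offline_pytorch = 0.1467

# Estimated time to synthesize a 10-second clip in each mode.
for name, rtf in [
    ("Client-Server", rtf_client_server),
    ("Offline TRT-LLM", rtf_offline_trtllm),
    ("Offline Pytorch", rtf_offline_pytorch),
]:
    print(f"{name}: {synthesis_seconds(rtf, 10.0):.3f} s")

# Relative speedup of offline TRT-LLM over offline PyTorch.
speedup = rtf_offline_pytorch / rtf_offline_trtllm
print(f"TRT-LLM vs PyTorch speedup: {speedup:.2f}x")
```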
