vui

Small conversational speech models that can run on-device.

Installation

uv pip install -e .

Demo

Try on Gradio

python demo.py

Models

Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.
Vui.ABRAHAM is a single-speaker model that can reply with context awareness.
Vui.COHOST is a checkpoint with two speakers that can talk to each other.

Voice Cloning

You can clone voices with the base model quite well, but it's not perfect: the model hasn't seen that much audio and wasn't trained for long.

Research

vui is a Llama-based transformer that predicts audio tokens.

fluac is an audio tokenizer based on descript-audio-codec that reduces the number of codes per second by a factor of 4, from 86.1 Hz to 21.53 Hz.
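As a quick sanity check on those rates, here is the arithmetic as a small sketch. The 44.1 kHz sample rate and 512-sample hop are assumptions (typical of a 44.1 kHz descript-audio-codec setup), not values taken from the fluac code; only the factor of 4 and the 21.53 Hz figure come from the description above.

```python
# Assumed values, not read from the fluac source:
SAMPLE_RATE_HZ = 44_100   # assumed input sample rate
BASE_HOP = 512            # assumed samples per code frame in the base codec
EXTRA_DOWNSAMPLE = 4      # the 4x reduction described above

# Frames (codes) per second before and after the extra downsampling.
base_rate = SAMPLE_RATE_HZ / BASE_HOP
fluac_rate = base_rate / EXTRA_DOWNSAMPLE

print(f"{base_rate:.2f} Hz -> {fluac_rate:.2f} Hz")  # 86.13 Hz -> 21.53 Hz
```

Fewer codes per second means the transformer has a shorter token sequence to predict for the same stretch of audio.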

FAQ

Hardware: developed on two 4090s. https://x.com/harrycblum/status/1752698806184063153
Hallucinations: yes, the model does hallucinate, but this is the best I could do with limited resources! :(
VAD: it does slow things down, but it is needed to help remove stretches of silence.
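To illustrate the idea of removing silence, here is a minimal energy-threshold sketch. It is not the VAD that vui uses (a real VAD is model-based and more robust); the function name `drop_silence`, the 30 ms frame size, and the 0.01 threshold are all hypothetical choices for illustration.

```python
import math

def drop_silence(samples, sample_rate, frame_ms=30, threshold=0.01):
    """Keep only the frames whose RMS energy is at or above `threshold`."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            kept.extend(frame)
    return kept

# Toy signal: 0.1 s of silence, 0.1 s of a 440 Hz tone, 0.1 s of silence.
sr = 16_000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 10)]
silence = [0.0] * (sr // 10)
signal = silence + tone + silence

trimmed = drop_silence(signal, sr)
print(len(signal), len(trimmed))
```

Even this toy version has to scan every frame, which hints at why running a proper VAD adds latency.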

Attributions

Whisper - https://github.com/openai/whisper
Audiocraft - https://github.com/facebookresearch/audiocraft
Descript Audio Codec - https://github.com/descriptinc/descript-audio-codec

Citation

@software{vui_2025,
  author = {Coultas Blum, Harry},
  month = {01},
  title = {{vui}},
  url = {https://github.com/fluxions-ai/vui},
  version = {1.0.0},
  year = {2025}
}
