8000 GitHub - fluxions-ai/vui
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fluxions-ai/vui

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

vui

Small Conversational speech models that can run on device

Installation

uv pip install -e .

Demo

Try on Gradio

python demo.py

Models

  • Vui.BASE is base checkpoint trained on 40k hours of audio conversations
  • Vui.ABRAHAM is a single speaker model that can reply with context awareness.
  • Vui.COHOST is checkpoint with two speakers that can talk to each other.

Voice Cloning

You can clone with the base model quite well but it's not perfect as hasn't seen that much audio / wasn't trained for long

Research

vui is a llama based transformer that predicts audio tokens.

fluac is a audio tokenizer based on descript-audio-codec which reduces the number of codes per second by 4 from 83.1hz to 21.53hz

FAQ

  1. Was developed with on two 4090's https://x.com/harrycblum/status/1752698806184063153
  2. Hallucinations: yes the model does hallucinate, but this is the best I could do with limited resources! :(
  3. VAD does slow things down but needed to help remove areas of silence.

Attributions

Citation

@software{vui_2025,
  author = {Coultas Blum, Harry},
  month = {01},
  title = {{vui}},
  url = {https://github.com/fluxions-ai/vui},
  version = {1.0.0},
  year = {2025}
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0