Allemon / Starred · GitHub

Allemon

1 follower · 0 following

Stars

canopyai / Orpheus-TTS

Towards Human-Sounding Speech

Python 5,044 414 Updated May 6, 2025

Rikorose / DeepFilterNet

Noise suppression using deep filtering

Python 3,124 295 Updated Oct 17, 2024

deep-floyd / IF

Python 7,823 524 Updated Apr 14, 2024

mcmonkeyprojects / SwarmUI

SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.

C# 2,695 245 Updated Jun 18, 2025

nomadkaraoke / python-audio-separator

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)

Python 782 128 Updated Jun 18, 2025

huggingface / parler-tts

Inference and training library for high-quality TTS models.

Python 5,303 562 Updated Dec 10, 2024

davidmartinrius / speech-dataset-generator

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 245 23 Updated Jun 10, 2024

jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,911 205 Updated May 17, 2025

invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The …

TypeScript 25,325 2,588 Updated Jun 19, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 21,904 1,789 Updated Jun 12, 2025

erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, D…

HTML 1,870 204 Updated Jun 9, 2025

adefossez / demucs

Forked from facebookresearch/demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 1,651 163 Updated Jul 15, 2024

neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 14,304 1,992 Updated Nov 19, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 47,807 5,263 Updated Jun 18, 2025

robinhad / ukrainian-tts

Ukrainian TTS (text-to-speech) using ESPNET

Python 220 22 Updated Mar 8, 2025

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 12,354 1,775 Updated Jun 11, 2025

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,799 572 Updated Aug 10, 2024

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 40,811 5,271 Updated Aug 16, 2024

facefusion / facefusion

Industry leading face manipulation platform

Python 23,415 3,655 Updated Jun 17, 2025

IAHispano / Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python 2,421 399 Updated Jun 16, 2025

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 16,311 1,750 Updated Jun 8, 2025

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 7,724 889 Updated Jun 18, 2025

ClipsAI / clipsai

Clips AI is an open-source Python library that automatically converts long videos into clips.

Python 294 51 Updated Jan 17, 2024

deepfakes / faceswap

Deepfakes Software For All

Python 54,136 13,416 Updated May 21, 2025

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 38,039 4,518 Updated Aug 19, 2024

intro-skipper / intro-skipper

Automatically detect and skip intro/credit sequences in Jellyfin

C# 1,316 74 Updated Jun 17, 2025

nielstenboom / recurring-content-detector

Unsupervised detection of opening / closing credits, recaps, and previews in video files 🎥🍿🎬

Python 97 15 Updated Dec 16, 2024

ZhengPeng7 / BiRefNet

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Python 2,368 171 Updated Jun 11, 2025

lllyasviel / Fooocus

Focus on prompting and generating

Python 45,426 7,139 Updated Jan 24, 2025

okankop / ASDNet

Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset

Python 62 7 Updated Jan 18, 2022
