Allemon / Starred · GitHub

Towards Human-Sounding Speech

Python 5,044 414 Updated May 6, 2025

Noise suppression using deep filtering

Python 3,124 295 Updated Oct 17, 2024
Python 7,823 524 Updated Apr 14, 2024

SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.

C# 2,695 245 Updated Jun 18, 2025

Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)

Python 782 128 Updated Jun 18, 2025

Inference and training library for high-quality TTS models.

Python 5,303 562 Updated Dec 10, 2024

🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.

Python 245 23 Updated Jun 10, 2024

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1,911 205 Updated May 17, 2025

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The …

TypeScript 25,325 2,588 Updated Jun 19, 2025

SOTA Open Source TTS

Python 21,904 1,789 Updated Jun 12, 2025

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, D…

HTML 1,870 204 Updated Jun 9, 2025

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 1,651 163 Updated Jul 15, 2024

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 14,304 1,992 Updated Nov 19, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 47,807 5,263 Updated Jun 18, 2025

Ukrainian TTS (text-to-speech) using ESPNET

Python 220 22 Updated Mar 8, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 12,354 1,775 Updated Jun 11, 2025

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,799 572 Updated Aug 10, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 40,811 5,271 Updated Aug 16, 2024

Industry leading face manipulation platform

Python 23,415 3,655 Updated Jun 17, 2025

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python 2,421 399 Updated Jun 16, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 16,311 1,750 Updated Jun 8, 2025

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 7,724 889 Updated Jun 18, 2025

Clips AI is an open-source Python library that automatically converts long videos into clips.

Python 294 51 Updated Jan 17, 2024

Deepfakes Software For All

Python 54,136 13,416 Updated May 21, 2025

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 38,039 4,518 Updated Aug 19, 2024

Automatically detect and skip intro/credit sequences in Jellyfin

C# 1,316 74 Updated Jun 17, 2025

Unsupervised detection of opening / closing credits, recaps, and previews in video files 🎥🍿🎬

Python 97 15 Updated Dec 16, 2024

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Python 2,368 171 Updated Jun 11, 2025

Focus on prompting and generating

Python 45,426 7,139 Updated Jan 24, 2025

Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset

Python 62 7 Updated Jan 18, 2022