Stars
Paint by numbers generator
PictureColorDiffusion is a program that automates 2D colorization of grayscale drawings using the AUTOMATIC1111 Stable Diffusion WebUI API, its interrogation feature, and the ControlNet extension. Addi…
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
zero-shot voice conversion & singing voice conversion, with real-time support
Intelligent multilingual AI video dubbing/translation tool - Linly-Dubbing: "AI empowerment, language without borders"
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
Voice Recognition to Text Tool / An offline, locally run audio/video-to-subtitle tool that outputs JSON, SRT subtitles, and plain text
Easily train a good VC model with voice data <= 10 mins!
Codename's rvc fork version 3, based on Applio.
🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows.
Interpolate and Upscale easily on Linux/Windows/MacOS.
FluidFrames | video AI frame-generation app
Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation
[ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".
Video optimizing, upscaling, interpolating and stream manipulation with hardware acceleration on Web user interface.
rafaelperez / RIFE-for-Nuke
Forked from hzwer/ECCV2022-RIFE
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
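Frame-interpolation software with better results and lower VRAM usage, 10-25 times faster than DAIN; includes frame-extraction processing to remove the choppy feel of anime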
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal is…
Real-time voice-changer for voice-chat, etc. Will support many different voice-filters and features in the future. 🎵
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…