DonkeyHang (Rx.Hang) / Starred · GitHub

Starred repositories

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,027 186 Updated Dec 22, 2023

TTS with kokoro and onnx runtime

Python 1,966 190 Updated May 10, 2025

[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,853 426 Updated Dec 10, 2024

Audio-FLAN

149 4 Updated Mar 6, 2025

This project uses a variety of advanced voiceprint recognition models, such as EcapaTdnn, ResNetSE, ERes2Net, and CAM++. More models may be supported in the future. At the …

Python 988 143 Updated Apr 29, 2025

Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"

Python 2,697 229 Updated May 7, 2025

dog-can-sing-song

Python 23 4 Updated May 11, 2025

Noise suppression using deep filtering

Python 27 4 Updated May 23, 2024

Limiter, compressor, convolver, equalizer and auto volume and many other plugins for PipeWire applications

C++ 7,245 292 Updated May 13, 2025

MNN ASR demo.

C++ 16 1 Updated Mar 24, 2025

Unofficial PyTorch implementation of SoundStream with training code and 16kHz pretrained checkpoint

Python 68 9 Updated Jun 25, 2023

LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

65 1 Updated Dec 28, 2024

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 760 83 Updated Apr 1, 2025

🚀 [Large models] Train a 26M-parameter vision multimodal VLM from scratch in just 1 hour!

Python 3,466 341 Updated Apr 27, 2025

🚀🚀 [Large models] Train a small 26M-parameter GPT completely from scratch in just 2 hours!

Python 20,774 2,427 Updated Apr 30, 2025

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,759 215 Updated Apr 30, 2025

PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

Python 342 40 Updated May 11, 2025

[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System

Python 3,383 339 Updated Apr 15, 2025

Interface for OuteTTS models.

Python 1,222 103 Updated Apr 29, 2025

An efficient speech separation method

Python 274 34 Updated Apr 11, 2024

Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Python 225 23 Updated Jul 31, 2024

QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Python 239 28 Updated Jul 13, 2023

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 13,808 1,414 Updated May 6, 2025

A lightweight end-to-end text-to-speech model

Python 114 13 Updated Feb 23, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,196 688 Updated May 7, 2025

zero-shot voice conversion & singing voice conversion, with real-time support

Python 2,450 278 Updated Apr 20, 2025

An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".

Python 67 7 Updated Apr 15, 2025

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)

C 2,864 673 Updated Feb 5, 2025

A real-time voice conversion model based on VITS.

Python 9 2 Updated Aug 1, 2024

SOFA: Singing-Oriented Forced Aligner

Python 168 24 Updated Apr 16, 2025