Transparent proxy server that works as a poor man's VPN. Forwards over ssh. Doesn't require admin. Works with Linux and MacOS. Supports DNS tunneling.

Python 12,319 757 Updated Apr 25, 2025

tvlearn / tvae-audio-restoration

Python 5 Updated Oct 7, 2024

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 207 11 Updated May 7, 2025

salute-developers / GigaAM

Foundational Model for Speech Recognition Tasks

Python 204 25 Updated Mar 4, 2025

facebookresearch / SONAR

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 764 83 Updated Apr 1, 2025

DataoceanAI / Dolphin

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 491 30 Updated May 19, 2025

yxlu-0102 / IDEA-TTS

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis

Python 26 4 Updated Mar 21, 2025

yynil / RWKVTTS

This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts).

Python 75 9 Updated May 22, 2025

Aria-K-Alethia / BigCodec

Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"

Python 160 13 Updated Sep 19, 2024

canopyai / Orpheus-TTS

Towards Human-Sounding Speech

Python 4,828 389 Updated May 6, 2025

Jiang-Yidi / UniCodec

UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

118 2 Updated Feb 28, 2025

EndlessReform / smoltts

Open TTS models, built for streaming on the edge

Jupyter Notebook 43 4 Updated Mar 16, 2025

mr150 / mlut

Atomic CSS toolkit with Sass and ergonomics for creating styles of any complexity

SCSS 165 2 Updated Apr 25, 2025

chrisdonahue / sheetsage

Transcribe music into lead sheets!

Python 373 80 Updated May 14, 2025

jim-schwoebel / voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1,919 241 Updated Jun 6, 2024

stepfun-ai / Step-Audio

Python 4,303 349 Updated Mar 12, 2025

sanderwood / clamp3

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages [ACL 2025]

Python 151 6 Updated May 11, 2025

ASLP-lab / DiffRhythm

Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Python 1,640 178 Updated May 10, 2025

SZU-AdvTech-2022 / 303-A-Variational-Em-Acceleration-for-Efficient-Clustering-at-Very-Large-Scales

Python 1 Updated Mar 14, 2023

sony / hFT-Transformer

PyTorch implementation of an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture (hFT-Transformer).

Python 101 11 Updated Jul 11, 2023

bytedance / piano_transcription

Python 1,786 219 Updated Aug 18, 2023

magic-research / vector_quantization

[NeurIPS 2024] Image Understanding Makes for A Good Tokenizer for Image Generation

Python 13 Updated Dec 17, 2024

multimodal-art-projection / YuE

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 5,001 553 Updated May 15, 2025

Zakhar Varfolomeev varfolomeeff

Stars

AIRI-Institute / lagnet-dft

pipecat-ai / pipecat

sobolevn / the-best-python-course

magic-research / Sa2VA

Stability-AI / generative-models

bchao1 / Anime-Face-Dataset

QwenLM / Qwen-Audio

sshuttle / sshuttle