varfolomeeff (Zakhar Varfolomeev) / Starred · GitHub
Python 8 Updated May 22, 2025

Open Source framework for voice and multimodal conversational AI

Python 6,207 887 Updated May 24, 2025

All the necessary materials for "The Best Python Course"

C 274 17 Updated Feb 25, 2025

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 1,094 71 Updated May 24, 2025

Generative Models by Stability AI

Python 25,907 2,874 Updated May 20, 2025

🖼 A collection of high-quality anime faces.

Python 418 35 Updated Jun 26, 2022

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,696 122 Updated Jul 5, 2024

Transparent proxy server that works as a poor man's VPN. Forwards over ssh. Doesn't require admin. Works with Linux and MacOS. Supports DNS tunneling.

Python 12,319 757 Updated Apr 25, 2025
Python 5 Updated Oct 7, 2024

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 207 11 Updated May 7, 2025

Foundational Model for Speech Recognition Tasks

Python 204 25 Updated Mar 4, 2025

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 764 83 Updated Apr 1, 2025

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 491 30 Updated May 19, 2025

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis

Python 26 4 Updated Mar 21, 2025

This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).

Python 75 9 Updated May 22, 2025

Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"

Python 160 13 Updated Sep 19, 2024

Towards Human-Sounding Speech

Python 4,828 389 Updated May 6, 2025

UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

118 2 Updated Feb 28, 2025

Open TTS models, built for streaming on the edge

Jupyter Notebook 43 4 Updated Mar 16, 2025

Atomic CSS toolkit with Sass and ergonomics for creating styles of any complexity

SCSS 165 2 Updated Apr 25, 2025

Transcribe music into lead sheets!

Python 373 80 Updated May 14, 2025

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1,919 241 Updated Jun 6, 2024
Python 4,303 349 Updated Mar 12, 2025

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages [ACL 2025]

Python 151 6 Updated May 11, 2025

Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Python 1,640 178 Updated May 10, 2025

PyTorch implementation of an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture (hFT-Transformer).

Python 101 11 Updated Jul 11, 2023

[NeurIPS 2024] Image Understanding Makes for A Good Tokenizer for Image Generation

Python 13 Updated Dec 17, 2024

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 5,001 553 Updated May 15, 2025