rickltt (dushuren) / Starred · GitHub

dushuren rickltt

10 followers · 12 following

Southern University of Science and Technology
Shenzhen, China

Stars

ZBang / USEF-TSE

Python 20 3 Updated Nov 2, 2024

smeetrs / deep_avsr

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

Python 234 41 Updated Feb 15, 2024

mpc001 / auto_avsr

Auto-AVSR: Lip-Reading Sentences Project

Python 352 55 Updated Jan 8, 2025

Chris10M / Lip2Speech

A pipeline to read lips and generate speech for the read content, i.e., Lip-to-Speech Synthesis.

Python 85 21 Updated Nov 25, 2021

facebookresearch / VisualVoice

Audio-Visual Speech Separation with Cross-Modal Consistency

Python 232 38 Updated Jul 25, 2023

ahmadikalkhorani / AVCrossNet

Python 7 1 Updated Jul 4, 2024

modelscope / FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Python 409 32 Updated Jan 25, 2024

AbrahamSanders / codec-bpe

Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs

Python 62 7 Updated Jun 22, 2025

asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers

Python 2,407 437 Updated Jan 11, 2025

facebookresearch / encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,724 329 Updated Jan 4, 2024

JorisCos / LibriMix

An open source dataset for source separation

Python 430 71 Updated Feb 9, 2024

lucidrains / rotary-embedding-torch

Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch

Python 699 58 Updated Nov 27, 2024

wangmou21 / abcs

Air and bone conduction speech

18 Updated Nov 26, 2022

cogmhear / avse_challenge

Forked from claritychallenge/clarity

COG-MHEAR Audio-Visual Speech Enhancement Challenge

Python 40 12 Updated May 7, 2025

bsxfan / PYLLR

Python toolkit for likelihood-ratio calibration of binary classifiers

Python 27 9 Updated Feb 21, 2023

karolpiczak / ESC-50

ESC-50: Dataset for Environmental Sound Classification

Python 1,587 301 Updated Mar 20, 2024

ddlBoJack / Speech-Resources

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

560 68 Updated Nov 13, 2024

bytedance / MegaTTS3

Python 5,577 416 Updated May 11, 2025

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Python 4,083 463 Updated Apr 15, 2025

skit-ai / SpeechLLM

This repository contains the training, inference, and evaluation code for SpeechLLM models and details about the model releases on Hugging Face.

Python 110 8 Updated Jun 25, 2024

ZhangXInFD / SpeechTokenizer

This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on

Python 575 53 Updated Jun 9, 2024

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 583 33 Updated Jun 2, 2025

TaoRuijie / SEANet

Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)

Python 20 Updated Feb 28, 2025

Rongjiehuang / TranSpeech

PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation

Python 171 23 Updated Jun 20, 2024

wenet-e2e / wespeaker

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python 943 143 Updated May 19, 2025

IDRnD / redimnet

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python 165 10 Updated Nov 14, 2024

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,239 2,288 Updated Jun 20, 2025

deepseek-ai / DeepSeek-R1

90,276 11,653 Updated Apr 9, 2025

clovaai / voxceleb_trainer

In defence of metric learning for speaker recognition

Python 1,111 284 Updated Mar 26, 2024

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 10,047 1,519 Updated Jun 18, 2025
