bigpon (Yi-Chiao WU) / Starred · GitHub

Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

Python 141 4 Updated Jun 6, 2025

Interactive visualizations of the geometric intuition behind diffusion models.

Svelte 781 32 Updated Jun 17, 2025

Versatile Evaluation of Speech and Audio

Python 288 31 Updated Jul 1, 2025

PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing

Python 62 4 Updated Jun 1, 2025

Audio processing using PyTorch 1D convolution networks

Python 1,073 93 Updated May 16, 2025

A neural full-band audio codec for general audio sampled at 48 kHz, operating at 7.5 kbps or 4.5 kbps.

Python 178 14 Updated Mar 22, 2025
Python 4,392 358 Updated Jun 12, 2025

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Python 142 21 Updated Feb 28, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 281 33 Updated Jun 15, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 526 37 Updated Jun 5, 2025

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

Python 237 23 Updated Jan 22, 2025

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Python 282 45 Updated May 22, 2025

Code and data recipes for the paper: Heterogeneous Target Speech Separation

Python 42 1 Updated Dec 6, 2022

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

Python 2,079 204 Updated Jul 2, 2025

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 2,921 166 Updated May 28, 2025

A library for soundscape synthesis and augmentation

Python 401 65 Updated May 4, 2022

The PyTorch-based audio source separation toolkit for researchers

Python 2,410 436 Updated Jan 11, 2025

ModelScope: bring the notion of Model-as-a-Service to life.

Python 8,073 834 Updated Jul 4, 2025

Audio sample repository for the speech separation model "MossFormer2".

Python 132 9 Updated Nov 28, 2024

Target Speaker Extraction Toolkit

Python 179 20 Updated Jul 4, 2025
Python 172 23 Updated Dec 5, 2024

PyTorch implementation of Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Python 485 77 Updated May 26, 2023

wsj0-{2, 3, 4, 5} mix generation scripts, in Python.

Python 60 6 Updated Mar 17, 2021

An open source dataset for source separation

Python 431 71 Updated Feb 9, 2024

The official implementation of PeriodWave and PeriodWave-Turbo

Python 200 13 Updated Apr 14, 2025

Generation scripts for EARS-WHAM and EARS-Reverb

Python 34 4 Updated Jul 4, 2025

FMA: A Dataset For Music Analysis

Jupyter Notebook 2,423 452 Updated Jan 5, 2023

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

586 24 Updated May 8, 2025

PAM is a no-reference audio quality metric for audio generation tasks

Python 65 6 Updated Jul 19, 2024

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Python 808 135 Updated Dec 1, 2024