980202006 / Starred · GitHub

Starred repositories

Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis

Python 32 4 Updated Jun 25, 2025

The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these factors with real speech and noise datasets.

Python 45 2 Updated Jun 21, 2025

Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"

Python 19 2 Updated Jun 27, 2025

Awesome Unified Multimodal Models

355 10 Updated Jun 27, 2025

OmniGen2: Exploration to Advanced Multimodal Generation.

Jupyter Notebook 2,019 146 Updated Jun 27, 2025

ZIQI-Eval: A Music Evaluation Benchmark for Large Language Models

Python 12 1 Updated Jul 23, 2024

Official repository for the paper - SLAP: Siamese Language-Audio Pretraining without negative samples for Music Understanding

Python 16 1 Updated Jun 21, 2025

Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)

Python 118 11 Updated Dec 5, 2024

"Large Language Models" (《大语言模型》), by Zhao Xin, Li Junyi, Zhou Kun, Tang Tianyi, and Wen Jirong

Python 3,732 274 Updated Mar 31, 2025
Python 519 57 Updated Jun 25, 2025

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 181 13 Updated Jun 26, 2025

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

Python 821 77 Updated Jun 25, 2025

Mel cepstral distortion (MCD) computations in python.

Python 224 35 Updated Jun 13, 2017

LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform

Jupyter Notebook 17 2 Updated May 17, 2024

A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

Python 92 11 Updated Feb 12, 2025
Python 33 4 Updated May 13, 2024

JiOu-LLM: an odd/even number classification model based on llama2

Python 5 1 Updated Mar 11, 2024
Python 55 5 Updated Jun 22, 2025
Python 588 57 Updated Jun 25, 2025
Python 47 4 Updated Aug 27, 2024

Multimodal paper reading (Visual, Audio)

11 Updated Jun 22, 2025
Python 3 Updated Feb 9, 2025

A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).

836 53 Updated Jul 10, 2024

One-click download tool for Ximalaya (喜马拉雅) album audio

JavaScript 1,145 155 Updated Feb 15, 2025

Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.

Python 496 30 Updated Jun 14, 2025
Python 12 Updated Jun 24, 2025

Delayed Streams Modeling (DSM) is a flexible formulation for streaming, multimodal sequence-to-sequence learning.

Python 338 28 Updated Jun 27, 2025
Jupyter Notebook 114 7 Updated Jun 20, 2025