Georgehappy1 (Georgehappy1) / Starred · GitHub
Starred repositories

Ke-Omni-R is an advanced audio reasoning model that achieved SOTA on MMAU

Python 15 Updated Apr 29, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 3,574 224 Updated May 8, 2025

llm & rl

Jupyter Notebook 120 12 Updated May 12, 2025
Python 50 5 Updated Apr 1, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 2,933 221 Updated May 15, 2025

A Conversational Speech Generation Model

Python 13,229 1,253 Updated Mar 27, 2025
Python 722 32 Updated Apr 18, 2025

UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

118 2 Updated Feb 28, 2025

Audio-FLAN

150 4 Updated Mar 6, 2025

A low-bitrate single-codebook 16 kHz speech codec based on focal modulation

Python 86 10 Updated Feb 12, 2025

Simple RL training for reasoning

Python 3,562 265 Updated Apr 10, 2025

Unified automatic quality assessment for speech, music, and sound.

Python 484 31 Updated May 1, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,337 155 Updated Mar 20, 2025

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 4,969 544 Updated May 15, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,814 129 Updated May 13, 2025

Fully open reproduction of DeepSeek-R1

Python 24,421 2,250 Updated May 15, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,428 1,402 Updated May 15, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 265 32 Updated Mar 12, 2025

Align Anything: Training All-modality Model with Feedback

Jupyter Notebook 3,675 430 Updated May 1, 2025

Automatically Update Text-to-speech (TTS) Papers Daily using GitHub Actions (Updates Every 12 Hours)

Python 433 24 Updated May 15, 2025

PyTorch Implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.

Python 1,097 153 Updated Apr 3, 2025

LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

65 1 Updated Dec 28, 2024

Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"

Python 159 13 Updated Sep 19, 2024

An Open-Sourced LLM-empowered Foundation TTS System

Python 699 56 Updated Apr 15, 2025

[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer

Python 57 2 Updated Nov 1, 2024

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

Python 79 5 Updated Dec 20, 2024

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

Python 159 11 Updated May 15, 2025

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

552 67 Updated Nov 13, 2024

SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Python 255 7 Updated Dec 29, 2024

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 316 20 Updated Jan 2, 2025