lifeiteng (Feiteng) / Starred · GitHub

PyTorch video decoding

Python 534 33 Updated May 5, 2025

Official repo for CFG-Zero*

Python 524 19 Updated May 2, 2025

A novel cross-modal decoupling and alignment framework for multimodal representation learning.

JavaScript 20 1 Updated Mar 19, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 2,913 151 Updated May 6, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,416 1,406 Updated May 6, 2025

Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"

Python 142 1 Updated Apr 22, 2025

Freeware Advanced Audio (AAC) Decoder faad2 mirror

C 186 76 Updated Mar 4, 2025

Full-featured MP4 format, MPEG DASH, HLS, CMAF SDK and tools

C++ 2,153 500 Updated Nov 15, 2024

A python binding for FFmpeg which provides sync and async APIs

Python 334 53 Updated Jul 31, 2024

No-GIL Python environment featuring NVIDIA Deep Learning libraries.

Dockerfile 59 3 Updated Apr 14, 2025

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 202 7 Updated Apr 1, 2025

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 186 10 Updated May 6, 2025

An easy-to-use, fast, and easily integrable tool for evaluating audio LLM

Python 91 2 Updated Apr 17, 2025

The official repository of Dynamic-SUPERB.

Python 180 90 Updated Mar 15, 2025

TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/TokenBridge

Python 107 3 Updated May 6, 2025

Tools for handling speech data in machine learning projects.

Python 1,017 233 Updated May 2, 2025

Unified high-performance Python client for object and file stores.

Python 24 3 Updated May 5, 2025

Terminal string styling done right, in Python 🐍 🎉

Python 533 23 Updated Jan 7, 2024

🤗 R1-AQA Model: mispeech/r1-aqa

Python 245 21 Updated Mar 28, 2025

Implementation for SimDINO/SimDINOv2

Python 126 9 Updated Mar 15, 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Python 243 25 Updated Mar 20, 2025

Spark-TTS Inference Code

Python 9,077 947 Updated Apr 9, 2025

The python library for real-time communication

JavaScript 3,829 330 Updated Apr 23, 2025

Pytorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".

Python 50 2 Updated Apr 15, 2025

UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

118 2 Updated Feb 28, 2025

Audio-FLAN

142 4 Updated Mar 6, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,516 829 Updated Apr 29, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,225 723 Updated May 4, 2025

A differentiable version of SPTK

Python 182 16 Updated Apr 24, 2025