- scistor
- Beijing, China
Starred repositories
Voice activity detection (VAD) library, based on WebRTC's VAD engine
A Low-Latency, Lightweight and High-Performance Streaming VAD
A TTS model capable of generating ultra-realistic dialogue in one pass.
This is the repo of our work titled “Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception”
Learning Rust By Practice, narrowing the gap between beginner and skilled-dev through challenging examples, exercises and projects.
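Continuously sharing and translating excellent content from the AI field to help you beat AI. Just beat it! Stars and subscriptions are welcome; remember the domain so you don't lose your way: https://BeatAI.cn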
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
[Information Fusion'2025] Hierarchical multi-source cues fusion for mono-to-binaural based Audio Deepfake Detection
Morse Code Decoder & Detector with Deep Learning
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
Command-line program to download videos from YouTube.com and other video sites
SSL Layerwise analysis for speech deepfake detection
🕵️‍♂️🔊 Automatically update Audio Deepfake Detection (ADD) papers daily using GitHub Actions (updates every 12 hours)
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …