-
GPT-SoVITS-V3-Infer-API Public
Forked from CNFlyCat/GPT-SoVITS-V3-Infer-API
Convenient for developers to call inference models from version v1 to v3 through API, supporting streaming transmission and specified type file transfer.
Python MIT License Updated Feb 19, 2025 -
S3Tokenizer Public
Forked from xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Python Apache License 2.0 Updated Oct 12, 2024 -
vits_chinese Public
Forked from PlayVoice/vits_chinese
Best TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Also for voice clone!
Python Updated Mar 20, 2023 -
espnet Public
Forked from espnet/espnet
End-to-End Speech Processing Toolkit
Python Apache License 2.0 Updated Sep 8, 2022 -
metrics Public
Forked from Lightning-AI/torchmetrics
Machine learning metrics for distributed, scalable PyTorch applications.
Python Apache License 2.0 Updated Jul 27, 2022 -
speechbrain Public
Forked from speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Python Apache License 2.0 Updated Mar 25, 2022 -
DNS-Challenge Public
Forked from microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
Python Creative Commons Attribution 4.0 International Updated Mar 22, 2022 -
BLOOM-Net Public
Forked from kimsunwiub/BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
Python MIT License Updated Feb 13, 2022 -
HGCN Public
Forked from wangtianrui/HGCN
The official repo of "HGCN: Harmonic Gated Compensation Network For Speech Enhancement"
Python Updated Jan 31, 2022 -
FullSubNet Public
Forked from Audio-WestlakeU/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Python MIT License Updated Jan 21, 2022 -
AFRCNN-For-Speech-Separation Public
Forked from JusperLee/AFRCNN-For-Speech-Separation
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
-
sudo_rm_rf Public template
Forked from etzinis/sudo_rm_rf
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of sep…
Jupyter Notebook MIT License Updated Nov 21, 2021 -
aps Public
Forked from funcwj/aps
A personal toolkit for single/multi-channel speech recognition & enhancement & separation.
Python Apache License 2.0 Updated Nov 5, 2021 -
voicefixer_main Public
Forked from haoheliu/voicefixer_main
General Speech Restoration
Python GNU Affero General Public License v3.0 Updated Nov 2, 2021 -
conformer Public
Forked from sooftware/conformer
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Python Apache License 2.0 Updated Oct 9, 2021 -
generative_inpainting Public
Forked from JiahuiYu/generative_inpainting
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral
Python Other Updated Aug 29, 2021 -
pyloudnorm Public
Forked from csteinmetz1/pyloudnorm
Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm
Python MIT License Updated Aug 28, 2021 -
DeepLearning-500-questions Public
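Forked from scutan90/DeepLearning-500-questions
Deep Learning 500 Questions: a question-and-answer treatment of frequently raised topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, written to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly asked to point out any errors. To be continued. For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06
JavaScript GNU General Public License v3.0 Updated May 30, 2021 -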
deep_avsr Public
Forked from smeetrs/deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Python MIT License Updated May 20, 2021 -
hifi-gan Public
Forked from jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Python MIT License Updated Apr 28, 2021 -
SpecAugment Public
Forked from DemisEom/SpecAugment
An Implementation of SpecAugment with Tensorflow & Pytorch, introduced by Google Brain
Python Apache License 2.0 Updated Apr 27, 2021 -
performer-pytorch Public
Forked from lucidrains/performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
Python MIT License Updated Apr 21, 2021 -
MetricGAN Public
Forked from JasonSWFu/MetricGAN
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (ICML 2019, with Travel awards)
MATLAB Updated Apr 19, 2021 -
hifigan-denoiser Public
Forked from rishikksh20/hifigan-denoiser
HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Python Apache License 2.0 Updated Apr 8, 2021 -
MS-SNSD Public
Forked from microsoft/MS-SNSD
The Microsoft Scalable Noisy Speech Dataset (MS-SNSD) is a noisy speech dataset that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) l…
HTML MIT License Updated Mar 15, 2021 -
torch-dct Public
Forked from zh217/torch-dct
DCT (discrete cosine transform) functions for pytorch
Python MIT License Updated Mar 9, 2021 -
pytorch-inpainting-with-partial-conv Public
Forked from naoto0804/pytorch-inpainting-with-partial-conv
Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions' [Liu+, ECCV2018]
Python MIT License Updated Dec 4, 2020