- MIT
- USA
- https://www.csail.mit.edu/person/nauman-dawalatabad
- in/nauman-daw
- @NaumanDawalatab
Stars
A collection of AWESOME things about domain adaptation
Re-implementation of the SLAM-ASR paper's experiment, using Phi-2 and HuBERT
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
A python package to analyze and compare voices with deep learning
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
In defence of metric learning for speaker recognition
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
A resource for learning about Machine Learning & Deep Learning
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Example code for a neural transducer model.
Learn System Design concepts and prepare for interviews using free resources.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Python interface to the WebRTC Voice Activity Detector
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
🔊 Text-Prompted Generative Audio Model
All Algorithms implemented in Python
Align word sequences and calculate metrics like word error rate (WER)
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Meditron is a suite of open-source medical Large Language Models (LLMs).