rohithkodali

12 followers · 11 following

proneval Public
Forked from KoelLabs/ML

Koel Labs innovates real-time pronunciation feedback for language learners! This repo contains the ML training, evaluation, and data processing code

Jupyter Notebook GNU Affero General Public License v3.0 Updated Jan 14, 2025
watermark-detection Public
Forked from boomb0om/watermark-detection

Model for watermark classification implemented with PyTorch

Jupyter Notebook Updated Sep 19, 2024
LookOnceToHear Public
Forked from vb000/LookOnceToHear

A novel human-interaction method for real-time speech extraction on headphones.

Python Other Updated May 10, 2024
Whisper-Hindi-ASR-model-IIT-Bombay-Intership Public

The Whisper Hindi ASR (Automatic Speech Recognition) model utilizes the KathBath dataset, a comprehensive collection of speech samples in Hindi. Trained on this dataset, Whisper employs advanced de…

Jupyter Notebook Eclipse Public License 2.0 Updated Apr 23, 2024
supervoice-dataset Public
Forked from ex3ndr/supervoice-librilight-preprocessed

60k hours of phoneme-aligned audio from audio books

Python Updated Apr 12, 2024
Amphion Public
Forked from open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python MIT License Updated Nov 28, 2023
ConsistencyVC-voive-conversion Public
Forked from ConsistencyVC/ConsistencyVC-voive-conversion

Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

Python MIT License Updated Oct 16, 2023
MLnotebook Public
Forked from udlbook/udlbook

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook Other Updated Oct 13, 2023
PhonoQ Public
Forked from TAriasVergara/PhonoQ

PhonoQ is a deep learning model used to compute phonetic-based features related to duration, rate, rhythm*, and goodness of pronunciation* of 18 phonological classes

Python MIT License Updated Aug 31, 2023
VoskIdentification Public
Forked from virex-84/VoskIdentification

Test example of using a model for voice identification with the "Vosk" (Воск) speech recognition library: https://alphacephei.com/vosk/

Java Updated Aug 14, 2023
FastSAM Public
Forked from CASIA-IVA-Lab/FastSAM

Fast Segment Anything

Python Apache License 2.0 Updated Jul 30, 2023
Real-time-wake-word-detection Public
Forked from matron2017/Real-time-wake-word-detection

Spoken wake-word detection for conversational avatar

Jupyter Notebook Updated Jan 31, 2023
vall-e Public
Forked from enhuiz/vall-e

An unofficial PyTorch implementation of the audio LM VALL-E, WIP

Python MIT License Updated Jan 17, 2023
StyleTTS Public
Forked from yl4579/StyleTTS

Official Implementation of StyleTTS

Python MIT License Updated Jan 9, 2023
recurrent-interface-network-pytorch Public
Forked from lucidrains/recurrent-interface-network-pytorch

Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch

Python MIT License Updated Jan 8, 2023
langdetect Public

Language detection algorithm that can be extended to support any number of languages

Python Apache License 2.0 Updated Jan 5, 2023
langchain Public
Forked from langchain-ai/langchain

⚡ Building applications with LLMs through composability ⚡

Python MIT License Updated Jan 3, 2023
self-supervised-phone-segmentation Public
Forked from lstrgar/self-supervised-phone-segmentation

Phoneme segmentation using pre-trained speech models

Python GNU General Public License v3.0 Updated Nov 4, 2022
Deep-Learning-in-Production Public
Forked from ahkarami/Deep-Learning-in-Production

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

Updated Oct 14, 2022
you-only-hear-once Public
Forked from satvik-venkatesh/you-only-hear-once

Jupyter Notebook MIT License Updated Oct 13, 2022
TOI Public

Toi news

Python Updated Jul 6, 2022
ULCA-asr-dataset-corpus Public
Forked from Open-Speech-EkStep/ULCA-asr-dataset-corpus

Creative Commons Attribution 4.0 International Updated Sep 6, 2021
pifuhd Public
Forked from facebookresearch/pifuhd

High-Resolution 3D Human Digitization from A Single Image.

Python Other Updated Nov 8, 2020
transformer-cnn-emotion-recognition Public
Forked from IliaZenkov/transformer-cnn-emotion-recognition

Speech Emotion Classification with novel Parallel CNN-Transformer model built with PyTorch, plus thorough explanations of CNNs, Transformers, and everything in between

Jupyter Notebook 1 MIT License Updated Nov 6, 2020
conv-emotion Public
Forked from declare-lab/conv-emotion

This repo contains implementation of different architectures for emotion recognition in conversations

Python MIT License Updated Feb 5, 2020
ddsp Public
Forked from magenta/ddsp

DDSP: Differentiable Digital Signal Processing

Python Apache License 2.0 Updated Jan 16, 2020
whisper-to-normal-speech-conversion Public
Forked from Maitreyapatel/speech-conversion-between-different-modalities

Whisper-to-Normal Speech Conversion Using Generative Adversarial Networks

Python MIT License Updated Jan 2, 2020
Nepali-Ai-Anchor Public
Forked from kshitijsubedi/Nepali-Ai-Anchor

Nepali AI Anchor Using LSTM & Pix2Pix. [ Itonics Hackathon 2019]

Python Updated Dec 15, 2019
melgan-neurips Public
Forked from descriptinc/melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Python MIT License Updated Oct 26, 2019
Resemblyzer Public
Forked from resemble-ai/Resemblyzer

A python package to analyze and compare voices with deep learning

Python Apache License 2.0 Updated Oct 23, 2019
