Stars
Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings according to Ribeiro et al. (2011).
LLM Hands On Workshop, delivered by Ryan Daniels
Simple web interface to be hosted on 'devilcat' for human evaluation studies
A vocoder that converts audio to a mel-spectrogram and back using WaveGlow, with GPU support.
timohromadka / BigVGAN
Forked from NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Vector Quantized VAEs - PyTorch Implementation
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Get an introduction to how large language models work, and how to get up and running quickly.
COLMAP - Structure-from-Motion and Multi-View Stereo
♞ lichess.org: the forever free, adless and open source chess server ♞
Software synthesizer based on the SoundFont 2 specifications
Generative models for conditional audio generation
Rubik's cube learning tool to efficiently, effectively, and explainably generate 3-style algorithms (commutators) to take blindfold cubing to the next level.
A SvelteKit template with DaisyUI
3D Gaussian Splatting Renderer for WebGL
Three.js-based implementation of 3D Gaussian splatting
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
PyTorch port of Google Research's VGGish model used for extracting audio features.
Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
Trainer for audio-diffusion-pytorch
This repo contains code for comparing audio representations in the task of audio synthesis with Generative Adversarial Networks (GANs)
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
A repository for generating and training short audio samples with unconditional waveform diffusion on accessible consumer hardware (<2GB VRAM GPU)