- Barcelona
- https://medium.com/@santi.pdp
Stars
Reformer, the efficient Transformer, in Pytorch
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A Pytorch Implementation of MelGAN
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Pronounced as "musician", musicnn is a set of pre-trained deep convolutional neural networks for music audio tagging.
Reference implementation of real-time autoregressive WaveNet inference
CMU Wilderness Multilingual Speech Dataset
This Python code efficiently reverberates speech, starting from a dataset of close-talking speech signals and a collection of acoustic impulse responses.
imatge-upc / wav2pix
Forked from miqueltubau/Wav2Pix
Speech-conditioned face generation using Generative Adversarial Networks (ICASSP 2019)
Speech-conditioned face generation using Generative Adversarial Networks
Speech Enhancement Generative Adversarial Network in PyTorch
Keras implementation of Representation Learning with Contrastive Predictive Coding
SincNet is a neural architecture for efficiently processing raw audio samples.
Real NVP in PyTorch: a Minimal Working Example | Normalizing Flow
A WebGL accelerated JavaScript library for training and deploying ML models.
A vocoder framework that has been widely used in the research community since 1999.
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
A WaveNet-based vocoder for fast inference
Reimplementation of Variational Inference with Normalizing Flows (https://arxiv.org/abs/1505.05770)
Bachelor's thesis carried out at Universitat Politecnica de Catalunya in partial fulfillment of the requirements for the degree in Telecommunications Technologies and Services Engineering
🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI
Pytorch implementation of Self-Attention Generative Adversarial Networks (SAGAN)
This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
Speaker recognition baseline for PAV subject in ETSETB UPC (Telecom BCN)
Tacotron 2 - PyTorch implementation with faster-than-realtime inference