Speech2Text

Trying out state-of-the-art speech-to-text models on YouTube videos.

Resources

- Wav2Vec2-Base-960h
- Wav2Vec2-Large-Robust fine-tuned on LibriSpeech
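
As a rough illustration of how one of the models listed above might be run on audio taken from a video, here is a minimal sketch using the Hugging Face `transformers` library. It is not necessarily how this project does it. The Hub id `facebook/wav2vec2-base-960h` is assumed to correspond to Wav2Vec2-Base-960h, the helper names (`chunk_samples`, `transcribe`), the 30-second chunk length, and the minimum chunk size are all hypothetical choices, and the audio is assumed to have been extracted from the video and decoded to 16 kHz mono float samples beforehand.

```python
from typing import Iterator, List, Sequence

SAMPLE_RATE = 16_000  # Wav2Vec2 checkpoints expect 16 kHz mono audio
MIN_SAMPLES = 1_600   # assumption: skip trailing chunks shorter than 0.1 s

# Assumed Hub id for the "Wav2Vec2-Base-960h" resource.
MODEL_ID = "facebook/wav2vec2-base-960h"


def chunk_samples(
    samples: Sequence[float],
    chunk_seconds: float = 30.0,
    sample_rate: int = SAMPLE_RATE,
) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-length slices so long videos fit in memory."""
    step = int(chunk_seconds * sample_rate)
    if step <= 0:
        raise ValueError("chunk_seconds * sample_rate must be at least 1")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


def transcribe(samples: Sequence[float], model_id: str = MODEL_ID) -> str:
    """Greedy CTC transcription of 16 kHz mono audio, one chunk at a time."""
    # Heavy imports stay local so chunk_samples works without them installed.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    pieces: List[str] = []
    for chunk in chunk_samples(samples):
        if len(chunk) < MIN_SAMPLES:
            continue  # too short for the model's convolutional front end
        inputs = processor(
            list(chunk), sampling_rate=SAMPLE_RATE, return_tensors="pt"
        )
        with torch.no_grad():
            logits = model(inputs.input_values).logits
        # Greedy decoding: most likely token per frame, collapsed by the tokenizer.
        ids = torch.argmax(logits, dim=-1)
        pieces.append(processor.batch_decode(ids)[0])
    return " ".join(p for p in pieces if p)
```

Calling `transcribe(samples)` with a list or array of float samples would return one transcript string. Splitting at fixed boundaries can cut a word in half, so overlapping chunks or silence-based splitting may give cleaner results on long videos.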