Release v1.6 · bklynhlth/openwillis · GitHub

v1.6

@vjbytes102 vjbytes102 released this 14 Nov 23:54
· 525 commits to main since this release
a89e27a

OpenWillis v1.6


Release date: Wednesday November 15th, 2023


Version 1.6 brings significant changes that add flexibility to speech transcription, speaker separation, and the subsequent quantification of speech characteristics.


Users can now easily choose between different models for speech transcription, and can separate audio files with multiple speakers regardless of the speech transcription model used. The speech characteristics function has been updated to support outputs from any of these routes.


If you have feedback or questions, please reach out.


Contributors



General updates


There are now three separate speech transcription functions: one using Vosk, one using WhisperX, and one using Amazon Transcribe, each with its own pros and cons as described below.


Speech Transcription with Vosk: Speech transcription conducted locally on a user’s machine; needs fewer computational resources but is less accurate.
Speech Transcription with Whisper: Speech transcription conducted locally on a user’s machine; needs greater computational resources but is more accurate.
Speech Transcription with AWS: Speech transcription conducted via the Amazon Transcribe API; requires (typically) paid access to the API and AWS resources.
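
As a rough illustration of the trade-offs listed above, the following sketch encodes them in a small selection helper. It does not call OpenWillis. The names choose_backend, has_strong_hardware, has_aws_access, and must_stay_local are hypothetical, and the preference order (local WhisperX first, then Amazon Transcribe, then Vosk) is only one possible policy, assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Summary of one transcription route, paraphrased from the release notes."""
    name: str
    runs_locally: bool
    note: str


# Hypothetical lookup table; the keys are illustrative labels, not OpenWillis identifiers.
BACKENDS = {
    "vosk": Backend("vosk", True, "fewer computational resources, less accurate"),
    "whisper": Backend("whisper", True, "greater computational resources, more accurate"),
    "aws": Backend("aws", False, "Amazon Transcribe API, typically paid access"),
}


def choose_backend(has_strong_hardware: bool,
                   has_aws_access: bool,
                   must_stay_local: bool) -> Backend:
    """Pick a transcription route under an assumed preference order."""
    # Assumption: prefer the more accurate local model when the hardware allows it.
    if has_strong_hardware:
        return BACKENDS["whisper"]
    # Assumption: fall back to the cloud API only if audio may leave the machine.
    if has_aws_access and not must_stay_local:
        return BACKENDS["aws"]
    # Otherwise use the lightweight local option.
    return BACKENDS["vosk"]


if __name__ == "__main__":
    picked = choose_backend(has_strong_hardware=False,
                            has_aws_access=True,
                            must_stay_local=True)
    print(picked.name, "-", picked.note)
```

A project with different priorities, such as cost or throughput, would reorder these checks; the sketch only shows that the three routes differ in where they run and what they require.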

Finally, the Speech Characteristics function has been updated to support JSON transcripts from each of the speech transcription functions. It also fixes bugs that prevented certain variables from being calculated in certain contexts.

