Release v1.5 · bklynhlth/openwillis · GitHub

v1.5

@vjbytes102 vjbytes102 released this 13 Oct 16:53
· 563 commits to main since this release
3c4af9f

OpenWillis v1.5

Release date: Thursday Oct 5th, 2023

Version 1.5 refines the methods for speech transcription and speaker separation. OpenWillis can now use Whisper for speech transcription. This integration gives consistent transcription accuracy whether processing runs locally or on cloud-based servers, and it adds support for multiple languages.

If you have feedback or questions, please reach out.

Contributors

vjbytes102
anzarabbas

General updates

With Whisper integrated as one of the available transcription models, the speech transcription and speaker separation functions now follow a processing workflow similar to that of their cloud-based counterparts. The Speech Characteristics function was revised accordingly so that it supports the JSON files produced by Whisper.

Speech transcription v2.0

The new speech transcription function can now use WhisperX to transcribe speech to text. It can label speakers when a recording contains multiple speakers, and it has integrated speaker identification for structured clinical interviews.

Speaker separation v2.0

The speaker separation function has been updated to support JSON files with labeled speakers, which the user can now obtain by using WhisperX during speech transcription. In this scenario, the function simply splits the speakers based on the labels in the JSON file.
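As a rough illustration of label-based splitting, the sketch below groups transcript segments by their speaker label. It is not the OpenWillis implementation: the JSON layout (a `segments` list whose entries carry `start`, `end`, `text`, and `speaker`) is an assumption modeled on typical WhisperX diarized output, and `split_by_speaker` and the sample data are hypothetical names made up for this example.

```python
import json
from collections import defaultdict

# Hypothetical transcript; the field names are an assumption modeled on
# WhisperX-style diarized output, not a documented OpenWillis schema.
SAMPLE_JSON = """
{
  "segments": [
    {"start": 0.0, "end": 2.5, "text": " How have you been?", "speaker": "SPEAKER_00"},
    {"start": 2.8, "end": 6.1, "text": " Better than last week.", "speaker": "SPEAKER_01"},
    {"start": 6.4, "end": 8.0, "text": " Glad to hear it.", "speaker": "SPEAKER_00"}
  ]
}
"""


def split_by_speaker(transcript):
    """Group (start, end, text) tuples by the segment's speaker label."""
    by_speaker = defaultdict(list)
    for seg in transcript.get("segments", []):
        # Segments without a label are collected under a placeholder key.
        label = seg.get("speaker", "unlabeled")
        by_speaker[label].append((seg["start"], seg["end"], seg["text"].strip()))
    return dict(by_speaker)


transcript = json.loads(SAMPLE_JSON)
turns = split_by_speaker(transcript)

for speaker, segs in turns.items():
    # Total speaking time per speaker, in seconds.
    total = sum(end - start for start, end, _ in segs)
    print(speaker, len(segs), "segments,", round(total, 2), "s")
```

The time ranges collected per speaker could then be used to cut the source audio into one stream per speaker; that step is omitted here because it depends on the audio library in use.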

Speech characteristics v2.1

The speech characteristics function now supports JSON files acquired through WhisperX. All output variables remain the same.
