
brkgyln/whisper-plus


WhisperPlus: Advancing Speech-to-Text Processing 🚀

🛠️ Installation

pip install whisperplus
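
To confirm that the install succeeded, the package version can be read back from Python. This is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of whisperplus.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it is not provided by whisperplus.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("whisperplus")
    if found is None:
        print("whisperplus is not installed; run: pip install whisperplus")
    else:
        print(f"whisperplus {found} is installed")
```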

🤗 Model Hub

You can find the models on HuggingFace Spaces or on the HuggingFace Model Hub.

🎙️ Usage

To use the whisperplus library, follow the steps below for different tasks:

🎵 YouTube URL to Audio

from whisperplus import SpeechToTextPipeline, download_and_convert_to_mp3

# Download the video's audio track and convert it to an MP3 file.
url = "https://www.youtube.com/watch?v=di3rHkEZuUw"
audio_path = download_and_convert_to_mp3(url)

# Load the Whisper model and transcribe the downloaded audio.
pipeline = SpeechToTextPipeline(model_id="openai/whisper-large-v3")
transcript = pipeline(
    audio_path=audio_path, model_id="openai/whisper-large-v3", language="english"
)

print(transcript)
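
The snippet above can also be wrapped in a reusable function that rejects non-YouTube URLs before any download starts. The following is a sketch under stated assumptions: `youtube_video_id` and `transcribe_youtube` are hypothetical names that whisperplus does not provide, and the whisperplus calls simply mirror the keyword arguments used in the snippet above rather than a verified API reference.

```python
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url: str) -> str:
    """Extract the video ID from a YouTube watch URL or a youtu.be short link.

    Hypothetical helper for illustration. Raises ValueError for other URLs.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        video_id = parsed.path.lstrip("/")
    elif host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Watch links carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        raise ValueError(f"Not a YouTube URL: {url!r}")
    if not video_id:
        raise ValueError(f"No video ID found in: {url!r}")
    return video_id


def transcribe_youtube(
    url: str,
    model_id: str = "openai/whisper-large-v3",
    language: str = "english",
):
    """Download a YouTube video's audio and return its transcript.

    Assumption: the whisperplus calls follow the usage snippet in this README.
    """
    youtube_video_id(url)  # Fail fast on URLs that are not YouTube links.

    # Imported lazily so the URL check works without whisperplus installed.
    from whisperplus import SpeechToTextPipeline, download_and_convert_to_mp3

    audio_path = download_and_convert_to_mp3(url)
    pipeline = SpeechToTextPipeline(model_id=model_id)
    return pipeline(audio_path=audio_path, model_id=model_id, language=language)
```

Calling `transcribe_youtube("https://www.youtube.com/watch?v=di3rHkEZuUw")` would then return the transcript, provided whisperplus and its dependencies are installed.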

Contributing

pip install -r dev-requirements.txt
pre-commit install
pre-commit run --all-files

📜 License

This project is licensed under the terms of the Apache License 2.0.

🤗 Acknowledgments

This project is based on the HuggingFace Transformers library.

🤗 Citation

@misc{radford2022whisper,
  doi = {10.48550/ARXIV.2212.04356},
  url = {https://arxiv.org/abs/2212.04356},
  author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  title = {Robust Speech Recognition via Large-Scale Weak Supervision},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
