This is a community-based implementation, written mostly for practice. I will implement the model architecture as defined in the paper, but I am leaving the training script for someone else. If you have the time and energy, please contribute one!
$ pip3 install -U audio-xlstm
License: MIT
- Implement the flip module
- Correctly leverage the mLSTM module
- Ensure model architecture is correct
- Implement a training script on Whisper-like data
- Implement speech and audio recognition datasets
@article{xlstm,
title={xLSTM: Extended Long Short-Term Memory},
author={Beck, Maximilian and P{\"o}ppel, Korbinian and Spanring, Markus and Auer, Andreas and Prudnikova, Oleksandra and Kopp, Michael and Klambauer, G{\"u}nter and Brandstetter, Johannes and Hochreiter, Sepp},
journal={arXiv preprint arXiv:2405.04517},
year={2024}
}