- Jan 2021: Readme updated with detailed instructions on how to use our latest version!
- Dec 2020: We are migrating to a newer version for a more general, flexible, and scalable codebase. See the introduction below for more information! The legacy version can be accessed by checking out the tag `v0.1.0`: `git checkout v0.1.0`.
- This is an open source toolkit called S3PRL, which stands for Self-Supervised Speech Pre-training and Representation Learning.
- In this toolkit, various upstream self-supervised speech models are available with easy-to-load setups, and downstream evaluation tasks are available with easy-to-use scripts (a minimal usage sketch follows this list).
- Below is an intuitive illustration of how this toolkit may help you:
- View the list of upstreams we support: Upstream README
- View the list of downstreams we support: Downstream README
- Feel free to use or modify our toolkit in your research; any bug report or improvement suggestion will be appreciated.
- If you have any questions, please open up a new issue.
- If you find this toolkit helpful to your research, please consider citing our papers, thanks!
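As a rough sketch of the intended workflow, the snippet below loads a pretrained upstream model through the hub interface and extracts speech representations that a downstream task would then consume. The model name `mockingjay` and the exact output format are illustrative assumptions; see the Upstream and Downstream READMEs for the supported names and interfaces.

```python
import torch

# Load a pretrained upstream model through torch.hub.
# 'mockingjay' is an illustrative choice; see the Upstream README
# for the full list of supported upstream names.
upstream = torch.hub.load('s3prl/s3prl', 'mockingjay')
upstream.eval()

# A batch of raw 16 kHz waveforms, possibly of different lengths.
wavs = [torch.randn(16000), torch.randn(18000)]

with torch.no_grad():
    # Frame-level representations produced by the upstream model.
    # The exact return format (a padded tensor vs. a dict of hidden states)
    # depends on the model and version; consult the Upstream README.
    representations = upstream(wavs)

# A downstream task (e.g. phone classification or speaker verification)
# then trains a small model on top of these representations, which is
# what the provided downstream scripts automate.
```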
List of papers that used our toolkit (Feel free to add your own paper by making a pull request)
- Self-Supervised Pretraining
- Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders (Liu et al., 2020)
@article{mockingjay, title={Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders}, ISBN={9781509066315}, url={http://dx.doi.org/10.1109/ICASSP40776.2020.9054458}, DOI={10.1109/icassp40776.2020.9054458}, journal={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, publisher={IEEE}, author={Liu, Andy T. and Yang, Shu-wen and Chi, Po-Han and Hsu, Po-chun and Lee, Hung-yi}, year={2020}, month={May} }
- TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech (Liu et al., 2020)
@misc{tera, title={TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech}, author={Andy T. Liu and Shang-Wen Li and Hung-yi Lee}, year={2020}, eprint={2007.06028}, archivePrefix={arXiv}, primaryClass={eess.AS} }
- Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation (Chi et al., 2020)
- Explainability
- Understanding Self-Attention of Self-Supervised Audio Transformers (Yang et al., 2020)
@misc{understandingSAT, title={Understanding Self-Attention of Self-Supervised Audio Transformers}, author={Shu-wen Yang and Andy T. Liu and Hung-yi Lee}, year={2020}, eprint={2006.03265}, archivePrefix={arXiv}, primaryClass={cs.CL} }
- Adversarial Attack
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning (Wu et al., 2020), code for computing LNSR: utility/observe_lnsr.py
@misc{mockingjay_defense, title={Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning}, author={Haibin Wu and Andy T. Liu and Hung-yi Lee}, year={2020}, eprint={2006.03214}, archivePrefix={arXiv}, primaryClass={eess.AS} }
- Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models (Wu et al., 2021)
- Table of contents
- Installation
- Using upstreams
- Using downstreams
- Train upstream models
- Development pattern for contributors
- Reference
- Citation
- Python >= 3.6
- PyTorch version >= 1.7.0
- For pre-training new upstream models, you'll also need high-end GPU(s).
- To develop locally, install s3prl by:
git clone https://github.com/s3prl/s3prl.git
cd s3prl
pip install -r requirements.txt
- To use upstream models with the hub interface, cloning this repo is not required; only the dependencies in requirements.txt need to be met (see the quick example below).
- Instructions are documented here: Upstream README
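For a quick taste, assuming only the packages in requirements.txt are installed, an upstream model can be pulled directly through torch.hub without a local clone (the model name `tera` is only illustrative; the Upstream README lists the available entries):

```python
import torch

# torch.hub fetches the hub definitions and pretrained weights from the
# s3prl repository; no local clone is needed for this usage.
upstream = torch.hub.load('s3prl/s3prl', 'tera')
```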
- Warning: we are still developing and testing some downstream tasks; documentation for a task will be added once it has been fully tested.
- Instructions are documented here: Downstream README
- If you wish to train your own upstream models, please follow the instructions here: Pretrain README
- Create a personal fork of the main S3PRL repository in GitHub.
- Make your changes in a named branch different from `master`, e.g. create a branch `new-awesome-feature`.
- Contact us if you have any questions during development.
- Generate a pull request through the Web interface of GitHub.
- Please verify that your code is free of basic mistakes; we appreciate any contribution!
- PyTorch, PyTorch.
- Audio, PyTorch.
- Kaldi, Kaldi-ASR.
- Transformers, Hugging Face.
- PyTorch-Kaldi, Mirco Ravanelli.
- fairseq, Facebook AI Research.
- CPC, Facebook AI Research.
- APC, Yu-An Chung.
- NPC, Alexander-H-Liu.
- The S3PRL Toolkit:
@misc{S3PRL,
author = {Andy T. Liu and Yang Shu-wen},
title = {S3PRL: The Self-Supervised Speech Pre-training and Representation Learning Toolkit},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/s3prl/s3prl}
}