Flow matching based speech enhancement

This repository contains the official PyTorch implementation of the 2025 paper:

  • FlowSE: Flow Matching-based Speech Enhancement [1]

[Figure: FlowSE fig1]

Presentation video [English], Presentation video [Korean]

Speech examples are available on our [DEMOpage](https://seongqjini.com/speech-enhancement-with-flow-matching-method/).

This repository builds upon previous great works:

Installation

  • Create a new virtual environment with Python 3.10 (we have not tested other Python versions, but they may work).
  • Install the package dependencies via pip install -r requirements.txt.
  • W&B is required.

Training

Training is done by executing train.py. A minimal running example with default settings (as in our paper [1]) can be run with

python train.py --base_dir <your_dataset_dir>

where <your_dataset_dir> should be a directory containing the subdirectories train/ and valid/ (and optionally test/).

Each subdirectory must itself have two subdirectories clean/ and noisy/, with the same filenames present in both. We currently only support training with .wav files.
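Before starting a long training run, it can help to confirm that the dataset directory matches the layout described above. The following is a minimal sketch, not part of this repository: the function name check_dataset_layout and its arguments are hypothetical, and it only checks directory structure and matching .wav filenames, not audio content.

```python
from pathlib import Path


def check_dataset_layout(base_dir, splits=("train", "valid")):
    """Return a list of layout problems (an empty list means none were found).

    Hypothetical helper: checks that each split has clean/ and noisy/
    subdirectories containing the same .wav filenames.
    """
    problems = []
    base = Path(base_dir)
    for split in splits:
        clean_dir = base / split / "clean"
        noisy_dir = base / split / "noisy"
        missing = [d for d in (clean_dir, noisy_dir) if not d.is_dir()]
        if missing:
            problems.extend(f"missing directory: {d}" for d in missing)
            continue
        # Compare filenames only; the files themselves are not opened.
        clean = {p.name for p in clean_dir.glob("*.wav")}
        noisy = {p.name for p in noisy_dir.glob("*.wav")}
        for name in sorted(clean - noisy):
            problems.append(f"{split}: {name} is in clean/ but not in noisy/")
        for name in sorted(noisy - clean):
            problems.append(f"{split}: {name} is in noisy/ but not in clean/")
    return problems
```

Calling check_dataset_layout("<your_dataset_dir>") and printing the returned list shows any missing directories or unpaired files. Pass splits=("train", "valid", "test") to include the optional test/ split.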

To obtain the WSJ0-CHIME3 training set, see https://github.com/sp-uhh/sgmse and run create_wsj0_chime3.py.

To see all available training options, run python train.py --help.

Evaluation

To evaluate on a test set, run

python enhancement.py --test_dir <your_test_dataset_dir> --folder_destination <your_enh_result_save_dir> --ckpt <path_to_model_checkpoint> --N <num_of_time_steps>

<your_test_dataset_dir> should contain a subfolder test/ with subdirectories clean/ and noisy/, each of which should contain .wav files.
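To compare several values of --N, the evaluation command above can be assembled programmatically. The sketch below only builds and prints the command lines using the flags shown above. The helper name build_enhancement_cmd, the paths, and the step counts are hypothetical placeholders, not recommended settings.

```python
import shlex


def build_enhancement_cmd(test_dir, out_dir, ckpt, n_steps):
    """Build the enhancement.py command line as a list of arguments."""
    return [
        "python", "enhancement.py",
        "--test_dir", str(test_dir),
        "--folder_destination", str(out_dir),
        "--ckpt", str(ckpt),
        "--N", str(n_steps),
    ]


if __name__ == "__main__":
    # Placeholder paths and step counts, chosen only for illustration.
    for n in (1, 5, 10):
        cmd = build_enhancement_cmd("data", f"enhanced_N{n}", "model.ckpt", n)
        print(shlex.join(cmd))
```

Each printed line can be run in a shell, or the argument list can be passed to subprocess.run(cmd, check=True) from the repository root.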

Citations / References

[1] Seonggyu Lee, Sein Cheong, Sangwook Han, Jong Won Shin. FlowSE: Flow Matching-based Speech Enhancement, ICASSP, 2025.

@INPROCEEDINGS{10888274,
  author={Seonggyu Lee and Sein Cheong and Sangwook Han and Jong Won Shin},
  booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={FlowSE: Flow Matching-based Speech Enhancement}, 
  year={2025},
  doi={10.1109/ICASSP49660.2025.10888274}}
