Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model

Authors: N. Q. K. Duong, E. Vincent, R. Gribonval

IEEE Transactions on Audio, Speech, and Language Processing, Volume 18, Issue 7

Pages 1830 - 1840

Published: 01 September 2010

Abstract

This paper addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source. We then consider four specific covariance models, including a full-rank unconstrained model. We derive a family of iterative expectation-maximization (EM) algorithms to estimate the parameters of each model and propose suitable procedures adapted from the state of the art to initialize the parameters and to align the order of the estimated sources across all frequency bins. Experimental results on reverberant synthetic mixtures and live recordings of speech data show the effectiveness of the proposed approach.
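To make the modeling idea concrete, the following is a minimal sketch of how a zero-mean Gaussian model with a full-rank spatial covariance can be used to separate sources in a single time-frequency bin. It is an illustration under stated assumptions, not the algorithm of the paper: it assumes that each source's covariance factors into a scalar variance `v_j` and a 2x2 Hermitian spatial covariance `R_j`, that these parameters are already known (the EM estimation, initialization, and permutation alignment described above are not shown), and that source images are recovered with a multichannel Wiener filter. All function names and numerical values are hypothetical.

```python
# Illustrative sketch only: stereo (2-channel) mixture, 3 sources (under-determined),
# one time-frequency bin. Assumed model: source image c_j ~ N(0, v_j * R_j).
# Plain Python lists are used so the example has no external dependencies.

def mat_add(a, b):
    return [[a[i][k] + b[i][k] for k in range(2)] for i in range(2)]

def mat_scale(s, a):
    return [[s * a[i][k] for k in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][m] * b[m][k] for m in range(2)) for k in range(2)]
            for i in range(2)]

def mat_inv(a):
    # Closed-form inverse of a 2x2 matrix.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def mat_vec(a, x):
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]

def wiener_source_images(x, variances, spatial_covs):
    """Posterior-mean estimate of each source image given the mixture x."""
    # Mixture covariance: sum over sources of v_j * R_j.
    sigma_x = [[0j, 0j], [0j, 0j]]
    for v, r in zip(variances, spatial_covs):
        sigma_x = mat_add(sigma_x, mat_scale(v, r))
    sigma_x_inv = mat_inv(sigma_x)
    # Wiener filter for source j: W_j = v_j * R_j * sigma_x^{-1}; estimate is W_j x.
    return [mat_vec(mat_mul(mat_scale(v, r), sigma_x_inv), x)
            for v, r in zip(variances, spatial_covs)]

# Toy values (hypothetical): one mixture coefficient per channel,
# three variances, three full-rank Hermitian spatial covariances.
x = [1.0 + 0.5j, -0.3 + 0.2j]
variances = [2.0, 0.5, 1.0]
spatial_covs = [
    [[1.0 + 0j, 0.3 + 0.1j], [0.3 - 0.1j, 0.8 + 0j]],
    [[0.6 + 0j, -0.2 + 0.2j], [-0.2 - 0.2j, 1.1 + 0j]],
    [[0.9 + 0j, 0.1 - 0.4j], [0.1 + 0.4j, 0.7 + 0j]],
]
estimates = wiener_source_images(x, variances, spatial_covs)
```

Because the Wiener filters of all sources sum to the identity matrix, the estimated source images add up to the observed mixture in each channel, which gives a simple sanity check for a sketch like this. A full-rank `R_j` lets each source occupy more than one spatial direction, which is one way to account for reverberation; a rank-1 `R_j` would instead correspond to a single fixed mixing vector per source.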




Published In

IEEE Transactions on Audio, Speech, and Language Processing, Volume 18, Issue 7

September 2010

211 pages

ISSN: 1558-7916

Publisher

IEEE Press
