Research Article
Open access
Published: 01 December 2006

Speech Source Separation in Convolutive Environments Using Space-Time-Frequency Analysis

Shlomo Dubnov¹,
Joseph Tabrikian² &
Miki Arnon-Targan²

EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 038412 (2006)

Abstract

We propose a new method for speech source separation based on directionally disjoint estimation of the transfer functions between microphones and sources at different frequencies and at multiple times. The spatial transfer functions are estimated from eigenvectors of the microphones' correlation matrix. Smoothing and association of transfer function parameters across different frequencies are performed by simultaneous extended Kalman filtering of the amplitude and phase estimates. This approach allows transfer function estimation even when the number of sources exceeds the number of microphones, and it can operate for both wideband and narrowband sources. The proposed method was evaluated via simulations, and the results show good performance.
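The eigenvector step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that, at a given frequency bin and over a short run of frames, a single source dominates, so that the principal eigenvector of the sample correlation matrix is proportional to that source's spatial transfer vector. The function name `estimate_transfer_vector`, the normalization to the first microphone, and the dominance ratio are illustrative assumptions, not the authors' implementation; the Kalman smoothing and cross-frequency association steps are not shown.

```python
import numpy as np

def estimate_transfer_vector(snapshots):
    """Estimate a relative transfer vector at one frequency bin.

    snapshots: complex array of shape (n_mics, n_frames) holding
    short-time Fourier coefficients of each microphone at one bin.
    Assumes (hypothetically) that one source dominates these frames.
    """
    n_frames = snapshots.shape[1]
    # Sample spatial correlation matrix across microphones.
    R = snapshots @ snapshots.conj().T / n_frames
    # eigh returns eigenvalues of a Hermitian matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    v = eigvecs[:, -1]
    # Remove the arbitrary complex scale by referencing microphone 0.
    v = v / v[0]
    # Fraction of energy in the principal direction; values near 1
    # suggest that a single source dominates the bin.
    dominance = eigvals[-1] / eigvals.sum()
    return v, dominance
```

The magnitude and angle of the returned vector give per-bin amplitude and phase estimates of the kind the abstract describes being smoothed across frequency.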

Author information

Authors and Affiliations

CALIT 2, University of California, San Diego, CA, 92093, USA
Shlomo Dubnov
Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
Joseph Tabrikian & Miki Arnon-Targan

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Dubnov, S., Tabrikian, J. & Arnon-Targan, M. Speech Source Separation in Convolutive Environments Using Space-Time-Frequency Analysis. EURASIP J. Adv. Signal Process. 2006, 038412 (2006). https://doi.org/10.1155/ASP/2006/38412

Received: 10 February 2005
Revised: 28 September 2005
Accepted: 04 October 2005
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/38412