Discovering Convolutive Speech Phones Using Sparseness and Non-negativity

Paul D. O’Grady¹ &
Barak A. Pearlmutter²

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4666)

Included in the following conference series:

International Conference on Independent Component Analysis and Signal Separation

Abstract

Discovering a representation that allows auditory data to be parsimoniously represented is useful for many machine learning and signal processing tasks. Such a representation can be constructed by Non-negative Matrix Factorisation (NMF), which is a method for finding parts-based representations of non-negative data. Here, we present a convolutive NMF algorithm that includes a sparseness constraint on the activations and has multiplicative updates. In combination with a spectral magnitude transform of speech, this method extracts speech phones that exhibit sparse activation patterns, which we use in a supervised separation scheme for monophonic mixtures.
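
The abstract does not spell out the objective function or the update rules, so the following is only a minimal NumPy sketch of one commonly used convolutive NMF formulation, not the authors' algorithm. It assumes a Kullback-Leibler divergence objective with an L1 penalty on the activations, and models a magnitude spectrogram V as the sum over t of W[t] multiplied by H shifted t columns to the right. The names `shift`, `reconstruct`, `sparse_conv_nmf`, and the parameters `lam`, `n_iter`, and `eps` are all illustrative assumptions.

```python
import numpy as np


def shift(X, t):
    """Shift the columns of X right by t (left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out


def reconstruct(W, H):
    """Convolutive model: sum over t of W[t] @ (H shifted right by t)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def sparse_conv_nmf(V, R, T, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Assumed objective: KL(V || reconstruction) + lam * sum(H).

    V: non-negative (F, N) array, e.g. a magnitude spectrogram.
    Returns W with shape (T, F, R) and H with shape (R, N).
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((T, F, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V, dtype=float)
    for _ in range(n_iter):
        # Multiplicative update for H; lam in the denominator shrinks
        # the activations, which encourages sparseness.
        ratio = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift(ratio, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T)) + lam
        H *= num / (den + eps)
        # Multiplicative update for every time slice of the bases.
        ratio = V / (reconstruct(W, H) + eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (ratio @ Ht.T) / (ones @ Ht.T + eps)
        # Fix the scale ambiguity: give each basis unit sum and rescale
        # its activation row so the reconstruction is unchanged.
        scale = W.sum(axis=(0, 1)) + eps  # shape (R,)
        W /= scale
        H *= scale[:, None]
    return W, H
```

Calling `sparse_conv_nmf(V, R=10, T=8)` on a magnitude spectrogram `V` would return ten bases spanning eight frames each, together with their activation rows. Because the updates only multiply non-negative quantities, W and H stay non-negative throughout, and larger values of `lam` push more of the activations towards zero.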

Author information

Authors and Affiliations

Complex & Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland
Paul D. O’Grady
Hamilton Institute, National University of Ireland Maynooth, Co. Kildare, Ireland
Barak A. Pearlmutter

Editor information

Mike E. Davies, Christopher J. James, Samer A. Abdallah, Mark D. Plumbley

Cite this paper

O’Grady, P.D., Pearlmutter, B.A. (2007). Discovering Convolutive Speech Phones Using Sparseness and Non-negativity. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds) Independent Component Analysis and Signal Separation. ICA 2007. Lecture Notes in Computer Science, vol 4666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74494-8_65

DOI: https://doi.org/10.1007/978-3-540-74494-8_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74493-1
Online ISBN: 978-3-540-74494-8
eBook Packages: Computer Science, Computer Science (R0)