Abstract
Discovering a representation that allows auditory data to be parsimoniously represented is useful for many machine learning and signal processing tasks. Such a representation can be constructed by Non-negative Matrix Factorisation (NMF), which is a method for finding parts-based representations of non-negative data. Here, we present a convolutive NMF algorithm that includes a sparseness constraint on the activations and has multiplicative updates. In combination with a spectral magnitude transform of speech, this method extracts speech phones that exhibit sparse activation patterns, which we use in a supervised separation scheme for monophonic mixtures.
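The abstract does not state the cost function or the update rules, so the following is only a minimal sketch of the general idea: a convolutive NMF model V ≈ Σ_t W_t · shift(H, t), fitted with multiplicative updates and an L1 penalty on the activations H. It assumes a generalised Kullback-Leibler reconstruction cost and a unit-sum normalisation of each basis, which may differ from the choices made in the paper. The names `shift`, `reconstruct`, `sparse_conv_nmf`, and the parameter `lam` are hypothetical.

```python
import numpy as np

def shift(X, t):
    """Shift the columns of X right (t > 0) or left (t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out

def reconstruct(W, H):
    """Convolutive model: sum over t of W[t] @ shift(H, t).

    W has shape (T, M, R): T time frames, M frequency bins, R bases.
    H has shape (R, N): one activation row per basis.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def sparse_conv_nmf(V, R, T, lam=0.1, n_iter=200, seed=0, eps=1e-9):
    """Sketch of sparse convolutive NMF (assumed KL cost + lam * sum(H))."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((T, M, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Activation update: ratio of the negative to the positive part of
        # the gradient; the L1 penalty adds lam to the denominator.
        Q = V / (reconstruct(W, H) + eps)
        num = sum(W[t].T @ shift(Q, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T)) + lam
        H *= num / (den + eps)
        # Basis update: every time frame W[t] uses the same ratio matrix Q.
        Q = V / (reconstruct(W, H) + eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (Q @ Ht.T) / (ones @ Ht.T + eps)
        # Fix the scale ambiguity: each basis sums to one, and H absorbs
        # the scale so the reconstruction is unchanged.
        scale = W.sum(axis=(0, 1), keepdims=True) + eps  # shape (1, 1, R)
        W /= scale
        H *= scale[0, 0][:, None]
    return W, H
```

In a speech setting, `V` would be a magnitude spectrogram, each `W[:, :, r]` a short spectro-temporal basis, and each row of `H` its activation pattern; larger `lam` pushes the activations towards zero at some cost in reconstruction accuracy.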
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
O’Grady, P.D., Pearlmutter, B.A. (2007). Discovering Convolutive Speech Phones Using Sparseness and Non-negativity. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds) Independent Component Analysis and Signal Separation. ICA 2007. Lecture Notes in Computer Science, vol 4666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74494-8_65
DOI: https://doi.org/10.1007/978-3-540-74494-8_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74493-1
Online ISBN: 978-3-540-74494-8
eBook Packages: Computer Science, Computer Science (R0)