Abstract
Nonnegative matrix factorization (NMF) is well known as a powerful spectral model for audio signals. Existing work, including ours, has investigated the use of generic source spectral models (GSSM) based on NMF for single-channel audio source separation and shown its efficiency in different settings. This paper extends that work to the multichannel case, where the GSSM is combined with the source spatial covariance model within a unified Gaussian modeling framework. In particular, unlike the conventional combination, where the estimated variances of each source are further constrained by NMF separately, we propose to constrain the total variances of all sources jointly, and we find that this yields better separation performance. We present an expectation-maximization (EM) algorithm for parameter estimation. We demonstrate the effectiveness of the proposed approach on a benchmark dataset provided within the 2016 Signal Separation Evaluation Campaign.
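To make the idea of constraining variances with a fixed NMF dictionary more concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a hypothetical nonnegative matrix `V` of total variance estimates (frequency bins by time frames), such as an EM iteration might produce, and a hypothetical pre-trained dictionary `W` formed by concatenating per-source spectral bases. It then fits the activations `H` with the standard multiplicative update for the Itakura-Saito divergence while keeping `W` fixed. All names, sizes, and the choice of divergence are assumptions made for illustration.

```python
import numpy as np


def estimate_activations(V, W, n_iter=100, eps=1e-12, seed=0):
    """Fit nonnegative H so that V ~= W @ H, with the dictionary W held fixed.

    Uses the standard multiplicative update for the Itakura-Saito divergence.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # random positive start
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
    return H


def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence between two nonnegative matrices."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))


# Toy data (hypothetical sizes): two sources with 4 and 3 spectral bases.
rng = np.random.default_rng(1)
F, N = 16, 20
W_src1 = rng.random((F, 4)) + 0.1
W_src2 = rng.random((F, 3)) + 0.1
W = np.hstack([W_src1, W_src2])      # concatenated generic dictionary
V = W @ rng.random((7, N)) + 1e-6    # stand-in for total variance estimates

# One joint fit over all sources, then a split into per-source variances.
H = estimate_activations(V, W)
V_src1 = W_src1 @ H[:4]
V_src2 = W_src2 @ H[4:]
```

Fitting `H` once against the concatenated dictionary and splitting the result by rows mirrors the joint constraint described in the abstract, as opposed to running a separate factorization for each source.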
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Duong, T.T.H., Duong, N.Q.K., Nguyen, CP., Nguyen, QC. (2018). Multichannel Audio Source Separation Exploiting NMF-Based Generic Source Spectral Model in Gaussian Modeling Framework. In: Deville, Y., Gannot, S., Mason, R., Plumbley, M., Ward, D. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2018. Lecture Notes in Computer Science, vol 10891. Springer, Cham. https://doi.org/10.1007/978-3-319-93764-9_50
DOI: https://doi.org/10.1007/978-3-319-93764-9_50
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93763-2
Online ISBN: 978-3-319-93764-9
eBook Packages: Computer Science, Computer Science (R0)