Multichannel Audio Source Separation Exploiting NMF-Based Generic Source Spectral Model in Gaussian Modeling Framework

  • Conference paper
Latent Variable Analysis and Signal Separation (LVA/ICA 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10891)

Abstract

Nonnegative matrix factorization (NMF) is well known as a powerful spectral model for audio signals. Existing work, including ours, has investigated the use of generic source spectral models (GSSM) based on NMF for single-channel audio source separation and shown its efficiency in different settings. This paper extends that work to the multichannel case, where the GSSM is combined with the source spatial covariance model within a unified Gaussian modeling framework. In particular, unlike a conventional combination in which the estimated variances of each source are further constrained by NMF separately, we propose to constrain the total variances of all sources jointly, and we find that this yields better separation performance. We present an expectation-maximization (EM) algorithm for parameter estimation. We demonstrate the effectiveness of the proposed approach on a benchmark dataset provided within the 2016 Signal Separation Evaluation Campaign.
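
To make the NMF ingredient of the abstract concrete, the following is a minimal sketch, not the paper's multichannel EM algorithm. It assumes a fixed nonnegative dictionary `W` (standing in for a pre-learned generic source spectral model) and fits only the activations `H` to a variance-like matrix `V` under the Itakura-Saito divergence, which is commonly paired with Gaussian variance models. The function names, the toy matrices, and the choice of a majorization-minimization (MM) multiplicative update with exponent 1/2 are all illustrative assumptions.

```python
import math


def matmul(A, B):
    """Plain nested-list matrix product A @ B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_divergence(V, V_hat):
    """Itakura-Saito divergence, summed over all entries (entries must be > 0)."""
    total = 0.0
    for row_v, row_hat in zip(V, V_hat):
        for v, v_hat in zip(row_v, row_hat):
            ratio = v / v_hat
            total += ratio - math.log(ratio) - 1.0
    return total


def update_activations(V, W, H, n_iter=100):
    """Fit activations H >= 0 so that W @ H approximates V, keeping W fixed.

    Uses the MM multiplicative update for the Itakura-Saito divergence
    (exponent 1/2), which does not increase the divergence.
    Assumes V, W and the initial H are strictly positive.
    """
    n_freq, n_comp, n_frames = len(W), len(W[0]), len(V[0])
    H = [row[:] for row in H]  # work on a copy; the caller's H is untouched
    for _ in range(n_iter):
        V_hat = matmul(W, H)  # current model, shared by all entries of H
        for k in range(n_comp):
            for n in range(n_frames):
                num = sum(W[f][k] * V[f][n] / V_hat[f][n] ** 2 for f in range(n_freq))
                den = sum(W[f][k] / V_hat[f][n] for f in range(n_freq))
                H[k][n] *= math.sqrt(num / den)
    return H


# Toy example (hypothetical numbers): 3 frequency bins, 2 components, 2 frames.
W = [[1.0, 0.1], [0.5, 0.5], [0.1, 1.0]]   # fixed "generic" dictionary
H_true = [[2.0, 0.5], [0.3, 1.5]]
V = matmul(W, H_true)                       # variances to be explained
H0 = [[1.0, 1.0], [1.0, 1.0]]               # flat initialization

d_before = is_divergence(V, matmul(W, H0))
H_est = update_activations(V, W, H0)
d_after = is_divergence(V, matmul(W, H_est))
print("IS divergence before:", d_before, "after:", d_after)
```

In this sketch only the activations are estimated, which mirrors the idea of reusing a generic dictionary rather than learning one per recording. How the paper couples such a constraint with the spatial covariance model, and how it constrains the total variance of all sources, is not reproduced here.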

Notes

  1. https://sisec.inria.fr/sisec-2016/bgn-2016/

  2. https://sisec.inria.fr/sisec-2015/2015-underdetermined-speech-and-music-mixtures/

  3. http://parole.loria.fr/DEMAND/

Author information

Corresponding author

Correspondence to Thanh Thi Hien Duong.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Duong, T.T.H., Duong, N.Q.K., Nguyen, CP., Nguyen, QC. (2018). Multichannel Audio Source Separation Exploiting NMF-Based Generic Source Spectral Model in Gaussian Modeling Framework. In: Deville, Y., Gannot, S., Mason, R., Plumbley, M., Ward, D. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2018. Lecture Notes in Computer Science, vol 10891. Springer, Cham. https://doi.org/10.1007/978-3-319-93764-9_50

  • DOI: https://doi.org/10.1007/978-3-319-93764-9_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93763-2

  • Online ISBN: 978-3-319-93764-9

  • eBook Packages: Computer Science, Computer Science (R0)
