A Voice Morphing Model Based on the Gaussian Mixture Model and Generative Topographic Mapping

Murad A. Rassam, Rasha Almekhlafi, Eman Alosaily, Haneen Hassan, Reem Hassan, Eman Saeed & Elham Alqershi

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1073)

Included in the following conference series:

International Conference of Reliable Information and Communication Technology

Abstract

In this paper, a new model for voice morphing is proposed. The spectral characteristics of a source speaker’s speech are transformed so that the speech sounds as if it had been spoken by a designated target speaker. The proposed model performs a phoneme segmentation of the voice signal and then transforms the spectral characteristics of each segment using a Linear Prediction model. The spectral features, extracted using the Linear Prediction Coding (LPC) technique, are aligned using Dynamic Time Warping (DTW). The Generative Topographic Mapping (GTM) method is used to model the LPC features, and the transformation is then achieved using the Gaussian Mixture Model (GMM). The transformed codebooks are finally converted to prediction coefficients, and the excitation signal is filtered to synthesize the speech. A correlation test between the source and target signals showed a high correlation. The results indicate that the proposed model is promising in terms of recognizing full sentences in addition to individual words.
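The first two stages named in the abstract, LPC feature extraction and DTW alignment, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`lpc`, `dtw_path`), the autocorrelation method with the Levinson-Durbin recursion, and the Euclidean frame distance are all assumptions chosen for illustration, and the GTM and GMM stages are not shown.

```python
import math


def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1, where the predictor is
    x[n] ~ -sum(a[k] * x[n - k] for k in 1..order)
    and err is the remaining prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def dtw_path(src, tgt):
    """Align two sequences of feature vectors with dynamic time warping.

    Returns (total_cost, path), where path is a list of
    (source_index, target_index) pairs.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(src[i - 1], tgt[j - 1])  # assumed frame distance
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

In a pipeline of the kind the abstract describes, `lpc` would be applied to each frame of the source and target utterances, and `dtw_path` would pair up the resulting feature vectors so that a mapping between them can be trained on time-aligned frames.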


Author information

Authors and Affiliations

Information Technology Department, College of Computer, Qassim University, Buraidah, Kingdom of Saudi Arabia
Murad A. Rassam
Faculty of Engineering and Information Technology, Taiz University, 6803, Taiz, Yemen
Murad A. Rassam, Rasha Almekhlafi, Eman Alosaily, Haneen Hassan, Reem Hassan, Eman Saeed & Elham Alqershi

Corresponding author

Correspondence to Murad A. Rassam.

Editor information

Editors and Affiliations

College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
Faisal Saeed
School of Computing, Universiti Utara Malaysia (UUM), Sintok, Kedah Darul Aman, Malaysia
Fathey Mohammed
Management of Information Systems Department, College of Business Administration, Taibah University, Yanbu, Saudi Arabia
Nadhmi Gazem

About this paper

Cite this paper

Rassam, M.A. et al. (2020). A Voice Morphing Model Based on the Gaussian Mixture Model and Generative Topographic Mapping. In: Saeed, F., Mohammed, F., Gazem, N. (eds) Emerging Trends in Intelligent Computing and Informatics. IRICT 2019. Advances in Intelligent Systems and Computing, vol 1073. Springer, Cham. https://doi.org/10.1007/978-3-030-33582-3_38

DOI: https://doi.org/10.1007/978-3-030-33582-3_38
Published: 02 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33581-6
Online ISBN: 978-3-030-33582-3
eBook Packages: Intelligent Technologies and Robotics (R0)
