Free Tools and Resources for HMM-Based Brazilian Portuguese Speech Synthesis

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11238)

Included in the following conference series:

Ibero-American Conference on Artificial Intelligence

Abstract

Text-to-speech (TTS) is currently a mature technology used in many areas such as education and accessibility. Some modules of a TTS system depend on the language and, while there are many public materials for some languages (e.g., English and Japanese), the resources for Brazilian Portuguese (BP) are still limited. This work describes the development of a complete hidden Markov model (HMM) based TTS system for BP that can be used in desktop environments. It also releases a set of natural language processing tools for BP, which expands the publicly available resources and supports new research for academic or industrial purposes. Subjective and objective performance tests are presented, comparing the proposed TTS system with other software currently available for BP.

Notes

1. The syllable is a unit that is relatively easy to identify and segment if the splitting rules stipulated by the language orthography are followed. However, as a phonological unit, there is no consensus about its basic structure, as discussed in [9]. For most authors, a syllable is defined so that its nucleus, canonically a vowel, constitutes a peak in the audibility curve and is preceded (onset) and/or followed (coda) by a sequence of segments (zero or more consonants) with progressively decreasing sonority values. The nucleus and coda are sometimes grouped together to form what is called the rhyme. Under these principles, the syllable is a unit of rhythmic organization in speech, although other authors disagree, stating that the syllable should be seen not in parts but as a whole.
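The onset/nucleus/coda decomposition described in this note can be illustrated with a minimal sketch. It assumes an already-separated orthographic syllable, a hypothetical vowel inventory (`VOWELS`), and the simplification that the nucleus is the first contiguous run of vowel letters; none of these choices comes from the paper, and digraphs such as "qu" or glides are not handled.

```python
from typing import NamedTuple

# Assumed orthographic vowel inventory (illustrative, not from the paper).
VOWELS = set("aeiouáâãàéêíóôõú")


class SyllableParts(NamedTuple):
    onset: str    # consonants before the nucleus (may be empty)
    nucleus: str  # vowel run treated as the sonority peak
    coda: str     # consonants after the nucleus (may be empty)

    @property
    def rhyme(self) -> str:
        # The rhyme groups the nucleus and the coda together.
        return self.nucleus + self.coda


def split_syllable(syllable: str) -> SyllableParts:
    """Split one syllable into onset, nucleus and coda.

    Simplification: the nucleus is the first contiguous run of vowel letters.
    """
    s = syllable.lower()
    # Index of the first vowel letter, or None if the string has no vowel.
    start = next((i for i, ch in enumerate(s) if ch in VOWELS), None)
    if start is None:
        raise ValueError(f"no vowel nucleus found in {syllable!r}")
    end = start
    while end < len(s) and s[end] in VOWELS:
        end += 1
    return SyllableParts(onset=s[:start], nucleus=s[start:end], coda=s[end:])


if __name__ == "__main__":
    parts = split_syllable("trans")  # first syllable of "trans-por-te"
    print(parts, parts.rhyme)
```

For the syllable "trans", this sketch yields onset "tr", nucleus "a", coda "ns" and rhyme "ans". A phonologically faithful splitter would work on phones rather than letters and would need language-specific rules.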

References

1. Dicionário Online de Português (2018). http://www.dicio.com.br/
2. Grupo FalaBrasil (2018). https://goo.gl/EWcfdg
3. HTS (2018). http://hts.sp.nitech.ac.jp/
4. HTS Engine (2018). http://hts-engine.sourceforge.net/
5. Alcaim, A., Solewicz, J.A., de Morais, J.A.: Frequência de ocorrência dos fones e listas de frases foneticamente balanceadas para o português falado no Rio de Janeiro. Revista da Sociedade Brasileira de Telecomunicações 7(1), 23–41 (1992)
6. Braga, D., Coelho, L., Resende Jr., F.G.V.: A rule-based grapheme-to-phone converter for TTS systems in European Portuguese, pp. 141–156 (2007)
7. Braga, D., Silva, P., Ribeiro, M., Dias, M.S., Campillo, F., García-Mateo, C.: Hélia, Heloisa and Helena: new HTS systems in European Portuguese, Brazilian Portuguese and Galician. In: International Conference on Computational Processing of the Portuguese Language, PROPOR 2010 (2010)
8. Cirigliano, R.J.R., Monteiro, C., Barbosa, F.L., Resende Jr., F.G.V.R., Couto, L.R., de Morais, J.A.: Um conjunto de 1000 frases foneticamente balanceadas para o português brasileiro obtido utilizando a abordagem de algoritmos genéticos. Anais do Simpósio Brasileiro de Telecomunicações (SBrT) (2005)
9. Collischonn, G.: Introdução a Estudos de Fonologia do Português Brasileiro. Porto Alegre: EDIPUCRS, pp. 95–126 (2005)
10. Costa, E., Monte, A., Neto, N., Klautau, A.: Um Framework para Desenvolvimento de Sistemas TTS Personalizados no Português do Brasil. In: XXX Simpósio Brasileiro de Telecomunicações (2012)
11. Couto, I., Neto, N., Tadaiesky, V., Klautau, A., Maia, R.: An open source HMM-based text-to-speech system for Brazilian Portuguese. In: 7th International Telecommunications Symposium (2010)
12. Dutoit, T., Pagel, V., Pierret, N., Bataille, F., Vrecken, O.V.D.: The MBROLA project: towards a set of high-quality speech synthesizers free of use for non-commercial purposes. In: Proceedings of ICSLP 1996, Philadelphia, vol. 3, pp. 1393–1396 (1996)
13. Faria, A.: Applied Phonetics: Portuguese Text-to-Speech. Technical report, University of California (2003)
14. Maciel, A., Carvalho, E.: Integration and evaluation of an HMM-based text-to-speech system to FIVE. In: 19th International Conference on Systems, Signals and Image Processing, IWSSIP 2012 (2012)
15. Maia, R., Zen, H., Tokuda, K., Kitamura, T., Resende, F.: An HMM-based Brazilian Portuguese speech synthetiser and its characteristics. J. Commun. Inf. Syst. 21, 58–71 (2006)
16. Monte, A., Ribeiro, D., Neto, N., Cruz, R., Klautau, A.: A rule-based syllabification algorithm with stress determination for Brazilian Portuguese natural language processing. In: 17th International Congress of Phonetic Sciences, pp. 1418–1421 (2011)
17. Barbosa, P., et al.: Aiuruete: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and hierarchical model of rhythm production. In: Proceedings of the Eurospeech 1999, pp. 2059–2062 (1999)
18. Schröder, M., Trouvain, J.: The German text-to-speech synthesis system MARY: a tool for research, development and teaching. Int. J. Speech Technol. 6, 365–377 (2001)
19. Silva, D., de Lima, A., Maia, R., Braga, D., de Moraes, J.F., de Moraes, J.A., Resende Jr., F.G.: A rule-based grapheme-phone converter and stress determination for Brazilian Portuguese natural language processing. In: VI International Telecommunications Symposium (2006)
20. Silva, D.C., Braga, D., Resende Jr., F.G.V.: Separação das Sílabas e Determinação da Tonicidade no Português Brasileiro. In: XXVI Simpósio Brasileiro de Telecomunicações, SBrT 2008 (2008)
21. Siravenha, A., Neto, N., Macedo, V., Klautau, A.: Uso de Regras Fonológicas com Determinação de Vogal Tônica para Conversão Grafema-Fone em Português Brasileiro. In: 7th International Information and Telecommunication Technologies Symposium (2008)
22. Souza, D., Saturnino, L., Maciel, A.: A portability evaluation of Brazilian Portuguese voice produced with MARY TTS. In: 2014 International Conference on Systems, Signals and Image Processing (IWSSIP) (2014)
23. Taylor, P.: Text-To-Speech Synthesis. Cambridge University Press, Cambridge (2009)
24. Turunen, M.: Speech application design and development. Technical report (2004)
25. Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T.: Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: Proceedings of EUROSPEECH, vol. 5, no. 98, pp. 2347–2350 (1999)
26. Zen, H., Tokuda, K., Black, A.W.: Statistical parametric speech synthesis. Speech Commun. 51(11), 1039–1064 (2009)

Author information

Authors and Affiliations

Institute of Exact and Natural Sciences, Federal University of Pará, Augusto Correa. 1, Belém, PA, 66075-110, Brazil
Ericson Costa & Nelson Neto

Corresponding authors

Correspondence to Ericson Costa or Nelson Neto.

Editor information

Editors and Affiliations

Universidad Nacional del Sur, Bahía Blanca, Buenos Aires, Argentina
Guillermo R. Simari
University of Madeira, Funchal, Portugal
Eduardo Fermé
Universidad Nacional de Piura, Castilla-Piura, Peru
Flabio Gutiérrez Segura
Universidad Nacional de Trujillo, Trujillo, Peru
José Antonio Rodríguez Melquiades

About this paper

Cite this paper

Costa, E., Neto, N. (2018). Free Tools and Resources for HMM-Based Brazilian Portuguese Speech Synthesis. In: Simari, G.R., Fermé, E., Gutiérrez Segura, F., Rodríguez Melquiades, J.A. (eds) Advances in Artificial Intelligence – IBERAMIA 2018. IBERAMIA 2018. Lecture Notes in Computer Science, vol 11238. Springer, Cham. https://doi.org/10.1007/978-3-030-03928-8_30

DOI: https://doi.org/10.1007/978-3-030-03928-8_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03927-1
Online ISBN: 978-3-030-03928-8
eBook Packages: Computer Science, Computer Science (R0)
