Abstract
This paper surveys the current state of ARTIC, a modern Czech concatenative corpus-based text-to-speech system. All stages of the system design are described, including the building of the acoustic unit inventory, text processing, and speech production. Two versions of the system are presented: a single unit instance system with moderate output speech quality, suitable for low-resource devices, and a multiple unit instance system with a dynamic unit instance selection scheme, which yields high-quality output speech. Both versions use automatically designed acoustic unit inventories. To ensure the desired prosodic characteristics of the output speech, the prosody generation issues specific to each system version are also discussed. Although the system was primarily designed for the synthesis of Czech speech, ARTIC can now speak three languages: Czech (with both female and male voices available), Slovak, and German.
Support for this work was provided by the Academy of Sciences of the Czech Republic, project No. 1ET101470416, and the Grant Agency of the Czech Republic, project No. 102/05/0278.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matoušek, J., Tihelka, D., Romportl, J. (2006). Current State of Czech Text-to-Speech System ARTIC. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_55
DOI: https://doi.org/10.1007/11846406_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science, Computer Science (R0)