Abstract
This paper gives an overview of the design concepts and implementation of a Hungarian microblog reading system. Synthesizing speech from this kind of text requires several special components. First, an efficient diacritic reconstruction algorithm was applied; the accuracy of an earlier dictionary-based method was improved with machine learning so that ambiguous cases are handled properly. Second, an unlimited-domain text-to-speech synthesizer was extended with emotional and spontaneous speaking styles. Chat and blog texts often contain "emoticons" that mark the emotional state of the user, so an expressive speech synthesis method was adapted to a corpus-based synthesizer. Four emotions were generated and evaluated in a listening test: neutral, happy, angry and sad. The results showed that happy and sad emotions can be generated with this algorithm, with the best accuracy for the female voice.
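The two text-processing steps named in the abstract, dictionary-based diacritic reconstruction and emoticon-driven emotion selection, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon, its counts, the emoticon table and all function names are hypothetical, and a simple most-frequent-candidate rule stands in for the machine-learning disambiguation that the paper applies to ambiguous words.

```python
import re
import unicodedata

# Hypothetical toy lexicon: accent-free form -> accented candidates with
# made-up frequency counts. A real system would derive this from a corpus.
LEXICON = {
    "szep": {"szép": 10},
    "kerek": {"kerek": 5, "kérek": 7, "kerék": 3},  # ambiguous entry
}

# Hypothetical emoticon table covering the four emotions in the abstract
# (anything unmatched falls back to "neutral").
EMOTICONS = {":)": "happy", ":-)": "happy", ":(": "sad", ":-(": "sad", ">:(": "angry"}


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def restore_word(word):
    """Return the most frequent accented candidate, or the word unchanged."""
    candidates = LEXICON.get(strip_diacritics(word.lower()))
    if not candidates:
        return word
    # Assumption: frequency replaces the paper's learned disambiguation.
    best = max(candidates, key=candidates.get)
    return best.capitalize() if word[0].isupper() else best


def restore_text(text):
    """Apply restore_word to every run of letters, leaving the rest as is."""
    return re.sub(r"[^\W\d_]+", lambda m: restore_word(m.group(0)), text)


def detect_emotion(text):
    """Pick an emotion from the first matching emoticon, longest first."""
    for emoticon in sorted(EMOTICONS, key=len, reverse=True):
        if emoticon in text:
            return EMOTICONS[emoticon]
    return "neutral"
```

In a pipeline of this shape, `restore_text` would run before synthesis and the label from `detect_emotion` would select the speaking style. The frequency rule cannot use context, which is exactly the weakness a learned classifier for ambiguous forms is meant to address.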
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zainkó, C., Csapó, T.G., Németh, G. (2010). Special Speech Synthesis for Social Network Websites. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science, vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_58
DOI: https://doi.org/10.1007/978-3-642-15760-8_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science (R0)