Optimal Utterance Selection for Unit Selection Speech Synthesis Databases

International Journal of Speech Technology

Abstract

This paper describes techniques for finding an optimal data set for building high-quality unit-selection speech synthesis inventories. Because the quality of unit-selection speech synthesis depends on the coverage of the database used in the selection, it is important to select the right data to record. We describe some simple techniques as well as a more complex acoustic modeling technique based on the database speaker's acoustic characteristics. Results of a simple evaluation procedure are presented to justify the technique.
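
The abstract does not spell out the simple selection techniques, so the following is only an illustrative sketch of one common coverage-driven approach: greedily choosing the utterance that adds the most not-yet-covered units. The use of adjacent phone pairs as units, the function names `units_of` and `greedy_select`, and the toy phone strings are all assumptions made for this example and are not taken from the paper.

```python
def units_of(phones):
    """Return the set of adjacent phone pairs (diphone-like units) in an utterance.

    Treating adjacent phone pairs as the unit type is an assumption of this sketch.
    """
    return set(zip(phones, phones[1:]))


def greedy_select(utterances, max_utterances):
    """Greedily pick utterances that add the most not-yet-covered units.

    utterances: dict mapping a (hypothetical) utterance id to its phone list.
    Returns the ids in selection order and the set of units they cover.
    """
    covered = set()
    selected = []
    remaining = dict(utterances)
    while remaining and len(selected) < max_utterances:
        best_id, best_gain = None, 0
        for utt_id, phones in remaining.items():
            gain = len(units_of(phones) - covered)
            if gain > best_gain:  # ties keep the earlier candidate
                best_id, best_gain = utt_id, gain
        if best_id is None:  # no candidate adds new coverage
            break
        covered |= units_of(remaining.pop(best_id))
        selected.append(best_id)
    return selected, covered


# Toy candidate prompts with made-up phone strings.
candidates = {
    "utt1": ["h", "e", "l", "ou"],
    "utt2": ["h", "e", "l", "p"],
    "utt3": ["l", "ou", "d"],
    "utt4": ["h", "e"],
}
selected, covered = greedy_select(candidates, max_utterances=3)
print(selected, len(covered))
```

In this toy run, `utt4` is never chosen because its only unit is already covered by `utt1`. A selection method based on the speaker's acoustic characteristics, as mentioned in the abstract, would replace the purely symbolic units used here with acoustically derived ones; that step is not sketched.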




About this article

Cite this article

Black, A.W., Lenzo, K. Optimal Utterance Selection for Unit Selection Speech Synthesis Databases. International Journal of Speech Technology 6, 357–363 (2003). https://doi.org/10.1023/A:1025704800086
