
Prosodic Phrasing: Machine and Human Evaluation

International Journal of Speech Technology

Abstract

This paper describes a set of experiments aimed at building and evaluating a new phrasing module for European Portuguese text-to-speech synthesis, using classification and regression trees learned from hand-labelled texts. When predicted boundaries are matched against the corresponding labelled ones, the best solution achieves an overall performance of 91.9%, with 86.3% of breaks correctly assigned and 4.3% false insertions. Although in absolute terms such scores may be considered surprisingly good given the size of the training set, the proportion of exact matches at the sentence level is much lower (22%). This suggested a more formal experiment to test the acceptability of the predicted phrasing in the judgement of human evaluators. As the model was trained not on a labelled speech corpus but on hand-labelled texts, the reference phrasing also needed to be assessed. The evaluation experiment involved 90 participants, who were asked to grade both the automatic and the reference phrasings and to indicate where they thought the breaks should be placed. As expected, the results showed large variability among subjects in their acceptance of a specific sentence partition, and criteria had to be defined to summarise the data from the different evaluators. With the adopted criteria, the sentence-level performance of the automatic assignment procedure is rated higher by human evaluators than by simple matching against the reference corpus (78% vs. 22%, respectively).
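The gap between juncture-level and sentence-level scores reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `phrasing_scores`, the 0/1 encoding of each word juncture, and the choice of denominators (correct breaks over reference breaks, false insertions over all junctures) are assumptions made for illustration, since the abstract does not define these measures precisely.

```python
from typing import Dict, List, Sequence


def phrasing_scores(
    reference: Sequence[Sequence[int]],
    predicted: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Compare predicted and reference phrasings.

    Each sentence is a sequence of 0/1 labels, one per word juncture
    (1 = break, 0 = no break). This encoding is an assumption.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need the same non-zero number of sentences")

    junctures = ref_breaks = matches = hits = insertions = exact = 0
    for ref, pred in zip(reference, predicted):
        if len(ref) != len(pred):
            raise ValueError("sentences must have the same number of junctures")
        junctures += len(ref)
        ref_breaks += sum(ref)
        for r, p in zip(ref, pred):
            matches += int(r == p)                # juncture labelled identically
            hits += int(r == 1 and p == 1)        # reference break recovered
            insertions += int(r == 0 and p == 1)  # break predicted where none is labelled
        exact += int(list(ref) == list(pred))     # whole sentence phrased identically

    if junctures == 0:
        raise ValueError("no junctures to score")

    return {
        "overall": matches / junctures,
        # Assumed denominator: breaks present in the reference labelling.
        "breaks_correct": hits / ref_breaks if ref_breaks else 0.0,
        # Assumed denominator: all junctures.
        "false_insertions": insertions / junctures,
        "sentence_exact": exact / len(reference),
    }


# Toy data, invented for illustration only.
reference: List[List[int]] = [[0, 1, 0, 0], [0, 0, 1], [1, 0]]
predicted: List[List[int]] = [[0, 1, 0, 0], [0, 1, 1], [0, 0]]
scores = phrasing_scores(reference, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A single mismatched juncture is enough to make a sentence count as a non-match, so the sentence-level score falls much faster than the juncture-level one. On the toy data, 7 of 9 junctures agree but only 1 of 3 sentences matches exactly.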




About this article

Cite this article

Viana, M.C., Oliveira, L.C. & Mata, A.I. Prosodic Phrasing: Machine and Human Evaluation. International Journal of Speech Technology 6, 83–94 (2003). https://doi.org/10.1023/A:1021060308216


  • DOI: https://doi.org/10.1023/A:1021060308216
