Abstract
A set of perception experiments, using reiterant and lexicalised speech, was designed to diagnose the relative involvement of prosody in the segmentation and hierarchisation of speech. Both natural and synthetic intonation were evaluated. Two distance measures (correlation and root-mean-square distance), computed on the acoustic parameters (F0, syllabic duration and intensity), were then matched against the perception results. This objective vs. subjective comparison highlights which acoustic cues listeners use to judge how adequately prosody performs a given function, such as demarcation. The results can be summarised as a scale of perceptual distance between two demarcation functions. The study also shows that listeners can retrieve pertinent information from purely prosodic stimuli.
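As an illustration only, the two objective measures named in the abstract can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes each contour is reduced to one value per syllable and that both contours have the same length. The function names and the F0 values are hypothetical.

```python
import math


def rms_distance(a, b):
    """Root-mean-square distance between two equal-length contours."""
    if len(a) != len(b) or not a:
        raise ValueError("contours must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def correlation(a, b):
    """Pearson correlation between two equal-length contours."""
    if len(a) != len(b) or not a:
        raise ValueError("contours must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a flat contour")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical per-syllable F0 values in Hz (illustrative, not from the study).
natural_f0 = [110.0, 120.0, 135.0, 100.0]
# Same contour shape, shifted upwards by a constant 5 Hz.
synthetic_f0 = [115.0, 125.0, 140.0, 105.0]

f0_rms = rms_distance(natural_f0, synthetic_f0)
f0_corr = correlation(natural_f0, synthetic_f0)
print(f"RMS distance: {f0_rms:.2f} Hz, correlation: {f0_corr:.3f}")
```

In this toy case the constant offset leaves the correlation at 1 while the RMS distance is 5 Hz, which suggests why the two measures are complementary: correlation reflects the shape of a contour, whereas RMS distance also reflects differences in absolute level.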
Cite this article
Rilliard, A., Aubergé, V. Prosody Evaluation as a Diagnostic Process: Subjective vs. Objective Measurements. International Journal of Speech Technology 6, 409–418 (2003). https://doi.org/10.1023/A:1025717202812