Voice Disguise and Automatic Detection: Review and Perspectives

  • Chapter
Progress in Nonlinear Speech Processing

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4391)

Abstract

This study addresses voice disguise and its detection. Voice disguise is treated here as a deliberate action by a speaker who wants to falsify or conceal his identity; voice alteration caused by channel distortion is not covered in this work. A wide range of options is open to a speaker who wants to change his voice and deceive a human listener or an automatic system. A voice can be transformed by electronic scrambling or, more simply, by exploiting intra-speaker variability: modifying the pitch, or changing the position of articulators such as the lips or tongue, which affects the formant frequencies. The work is divided into three parts: the first classifies the different options available for changing one’s voice, the second reviews the different techniques in the literature, and the third describes the main indicators proposed in the literature for distinguishing a disguised voice from the original voice and proposes some perspectives based on disordered and emotional speech.

Editor information

Yannis Stylianou, Marcos Faundez-Zanuy, Anna Esposito

Copyright information

© 2007 Springer Berlin Heidelberg

About this chapter

Cite this chapter

Perrot, P., Aversano, G., Chollet, G. (2007). Voice Disguise and Automatic Detection: Review and Perspectives. In: Stylianou, Y., Faundez-Zanuy, M., Esposito, A. (eds) Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol 4391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71505-4_7

  • DOI: https://doi.org/10.1007/978-3-540-71505-4_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71503-0

  • Online ISBN: 978-3-540-71505-4

  • eBook Packages: Computer Science (R0)
