50 Shades of Gray: Effect of the Color Scale for the Assessment of Speech Disorders

  • Conference paper
Text, Speech, and Dialogue (TSD 2022)

Abstract

Spectrograms provide a visual representation of the time-frequency variations of a speech signal, and the color scale used to render them can serve as a pre-processing normalization step. In this study, we investigated the suitability of different color scales for the reconstruction of spectrograms, together with bottleneck features extracted from Convolutional AutoEncoders (CAEs). We trained several CAEs considering different parameters such as the number of channels, wideband/narrowband spectrograms, and different color scales. Additionally, we tested the suitability of the proposed CAE architecture for predicting the severity of Parkinson’s Disease (PD) and the nasality level in children with Cleft Lip and Palate (CLP). The results showed that it is possible to estimate the neurological state for PD with Spearman’s correlations of up to 0.71 using the grayscale color scale, and the nasality level in CLP with F-scores of up to 0.58 using the raw spectrogram. Although the color scales improved performance in some cases, it is not clear which color scale is the most suitable for the selected application, as we did not find significant differences between the results obtained with each color scale.
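As a rough illustration of how a color scale can act as a normalization step, the sketch below maps a log-magnitude spectrogram to 8-bit grayscale values. It is not the authors' pipeline: the window lengths (5 ms for the wideband case, 30 ms for the narrowband case), the 10 ms hop, the 80 dB dynamic range, the synthetic 440 Hz tone, and the helper names `log_spectrogram` and `to_grayscale` are all assumptions chosen for this example.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, win_ms=25.0, hop_ms=10.0, n_fft=512):
    """Return a log-magnitude spectrogram in dB, shape (n_fft // 2 + 1, n_frames)."""
    win = int(sample_rate * win_ms / 1000)  # window length in samples
    hop = int(sample_rate * hop_ms / 1000)  # hop size in samples
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + win] * window for i in range(n_frames)]
    )
    # rfft zero-pads each frame to n_fft samples; transpose to (freq, time).
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)).T
    return 20.0 * np.log10(magnitude + 1e-10)


def to_grayscale(spec_db, dynamic_range_db=80.0):
    """Clip to a fixed range below the peak and scale to uint8 values in 0..255."""
    top = spec_db.max()
    floor = top - dynamic_range_db
    clipped = np.clip(spec_db, floor, top)
    scaled = (clipped - floor) / dynamic_range_db  # now in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)


# Synthetic one-second 440 Hz tone stands in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

# Short windows give better time resolution (wideband); long windows give
# better frequency resolution (narrowband). Both durations are assumed values.
wideband = to_grayscale(log_spectrogram(signal, sample_rate, win_ms=5.0))
narrowband = to_grayscale(log_spectrogram(signal, sample_rate, win_ms=30.0))

print(wideband.shape, narrowband.shape, wideband.dtype)
```

Because every spectrogram is clipped and rescaled to the same 0..255 range, the images fed to a model share a common scale regardless of recording level. A different color scale could be obtained by passing the normalized values through a colormap, which would yield three channels instead of one.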

T. Arias-Vergara—Work done during Ph.D. studies.

Acknowledgements

This work was funded by the European Union’s Horizon 2020 research and innovation programme under Marie Sklodowska-Curie grant agreement No. 766287, and partially funded by CODI at UdeA under grant No. PRG2020-34068. T. Arias-Vergara is supported by Convocatoria Doctorado Nacional-785, financed by COLCIENCIAS.

Author information

Corresponding author

Correspondence to Paula Andrea Pérez-Toro.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Pérez-Toro, P.A. et al. (2022). 50 Shades of Gray: Effect of the Color Scale for the Assessment of Speech Disorders. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2022. Lecture Notes in Computer Science, vol 13502. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_29

  • DOI: https://doi.org/10.1007/978-3-031-16270-1_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16269-5

  • Online ISBN: 978-3-031-16270-1

  • eBook Packages: Computer Science, Computer Science (R0)
