Abstract
Speech traits have enabled the evaluation and monitoring of the neurological state in different disorders, including Parkinson’s Disease (PD), using classical and deep learning approaches. Because speech carries paralinguistic information, the speaker’s native language influences the performance of models trained to classify the presence of the disease. Although researchers have performed several studies using corpora recorded under different acoustic and language conditions, there is no baseline for the accuracy of a system that classifies PD in cross-language scenarios. This study evaluates the generalization capability of different classical and deep methods to discriminate between PD patients and healthy speakers in cross-language scenarios. In particular, an Active Learning (AL) strategy is considered to evaluate how the selection of training data influences the model’s performance under cross-language settings. The results indicate that models based on Wav2Vec 2.0 yielded the best results in detecting the presence of the disease in such non-controlled cross-language scenarios. In addition, AL-based selection outperformed random selection of training samples. The considered AL-based approach makes it possible to achieve high accuracies through a careful, adaptive selection of training data. This is particularly important when dealing with non-annotated and limited data, as is the case in pathological speech modeling.
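The comparison between AL-based and random selection of training samples can be illustrated with a minimal sketch. The abstract does not state which query criterion the study uses, so the least-confidence form of uncertainty sampling shown here is an assumption, and the function names and posterior values are hypothetical placeholders rather than the authors' implementation or data.

```python
import random


def uncertainty_scores(probabilities):
    """Least-confidence score for binary posteriors: 1 - max(p, 1 - p).

    `probabilities` holds a model's predicted probability of the PD class
    for each unlabeled sample (hypothetical values in this sketch).
    """
    return [1.0 - max(p, 1.0 - p) for p in probabilities]


def select_most_uncertain(probabilities, k):
    """Return indices of the k samples the model is least sure about."""
    scores = uncertainty_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def select_random(pool_size, k, seed=0):
    """Baseline: pick k distinct pool indices uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(pool_size), k)


# Hypothetical posteriors for six unlabeled target-language recordings.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.80, 0.61]

picked = select_most_uncertain(pool_probs, k=2)    # samples closest to 0.5
baseline = select_random(len(pool_probs), k=2)     # random-selection baseline
print(picked, baseline)
```

In an adaptive loop, the selected samples would be annotated, added to the training set, and the model retrained before the next selection round.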
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Moreno-Acevedo, S.A., Rios-Urrego, C.D., Vásquez-Correa, J.C., Rusz, J., Nöth, E., Orozco-Arroyave, J.R. (2023). Language Generalization Using Active Learning in the Context of Parkinson’s Disease Classification. In: Ekštein, K., Pártl, F., Konopík, M. (eds) Text, Speech, and Dialogue. TSD 2023. Lecture Notes in Computer Science, vol 14102. Springer, Cham. https://doi.org/10.1007/978-3-031-40498-6_31
DOI: https://doi.org/10.1007/978-3-031-40498-6_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40497-9
Online ISBN: 978-3-031-40498-6
eBook Packages: Computer Science, Computer Science (R0)