DOI: 10.1145/3444884.3444890
research-article

The Study of Voice Pathology Detection based on MFCC and SVM

Published: 31 March 2021

Abstract

Subjective auditory-perceptual evaluation of voice is the simplest and most direct method for judging the severity of voice lesions and the effect of treatment, but its outcome depends closely on the clinical experience of the doctor. Recently, several automatic voice diagnosis methods based on voice feature parameters and classification algorithms have been proposed. The Mel Frequency Cepstral Coefficient (MFCC) is the most commonly used feature parameter. However, the role of dynamic MFCC features in improving diagnostic results is not clear. This study used three feature sets, MFCC, MFCC + ΔMFCC, and MFCC + ΔMFCC + ΔΔMFCC, each combined with a Support Vector Machine (SVM) classifier, to determine whether adding dynamic MFCC features improves the accuracy of pathological voice detection. The results showed that accuracy and specificity did not change significantly whether or not dynamic features were added. This suggests that the MFCC parameters change only slightly over time, at least for vowel vocalization. This study may provide useful information for pathological voice diagnosis based on vowel vocalization.
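The three feature sets compared in the abstract can be illustrated with a minimal sketch. The abstract does not say how the Δ and ΔΔ coefficients were computed, so the code below assumes the commonly used regression formula Δc_t = Σ_{n=1..N} n (c_{t+n} − c_{t−n}) / (2 Σ_{n=1..N} n²), with a window half-width N = 2 and edge frames repeated as padding. The function names (`delta`, `stack`, `build_feature_sets`) and the toy input are hypothetical, and MFCC extraction and the SVM classifier are left out.

```python
from typing import Dict, List

Frames = List[List[float]]  # one row of coefficients per analysis frame


def delta(frames: Frames, width: int = 2) -> Frames:
    """Regression-based delta over time (assumed formula, not taken from the paper).

    Edge frames are handled by repeating the first/last frame, which is
    also an assumption made for this sketch.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Frames = []
    for t in range(n_frames):
        row = []
        for k in range(n_coeffs):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack(*feature_sets: Frames) -> Frames:
    """Concatenate several feature sets frame by frame."""
    return [sum(rows, []) for rows in zip(*feature_sets)]


def build_feature_sets(mfcc: Frames) -> Dict[str, Frames]:
    """Return the three feature sets named in the abstract."""
    d1 = delta(mfcc)   # delta MFCC
    d2 = delta(d1)     # delta-delta MFCC
    return {
        "mfcc": mfcc,
        "mfcc+d": stack(mfcc, d1),
        "mfcc+d+dd": stack(mfcc, d1, d2),
    }


if __name__ == "__main__":
    # Toy stand-in for a perfectly steady vowel: identical frames.
    steady = [[1.0, 2.0] for _ in range(4)]
    for name, feats in build_feature_sets(steady).items():
        print(name, feats[0])
```

For the perfectly steady toy input every Δ and ΔΔ value is zero, so the extended feature vectors carry no information beyond the static MFCCs. This is an idealized version of the situation the abstract describes for sustained vowels; real recordings would give small but nonzero dynamic coefficients.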


Published In

ICBBE '20: Proceedings of the 2020 7th International Conference on Biomedical and Bioinformatics Engineering
November 2020
197 pages
ISBN:9781450388221
DOI:10.1145/3444884

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Mel frequency cepstral coefficient
  2. Voice pathology
  3. automatic diagnosis
  4. support vector machine

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Friendship Seeds Project of Beijing Friendship Hospital, Capital Medical University

Conference

ICBBE '20
