Abstract
Stuttering is a common problem in childhood that may persist into adulthood if not treated in its early stages. Techniques from spoken language understanding may be applied to provide automated diagnosis of stuttering from children's speech. The main challenges, however, lie in the lack of training data and the high dimensionality of this data. This study investigates the applicability of machine learning approaches for detecting stuttering events in transcripts. Two machine learning approaches were applied, namely the hidden event language model (HELM) and conditional random fields (CRF). The performance of these two approaches is compared, and the effect of data augmentation is examined for both. Experimental results show that CRF outperforms HELM by 2.2% in the baseline experiments. Data augmentation helps improve system performance, especially for rarely occurring events. In addition to the annotated augmented data, this study also adds annotated human transcriptions of real stuttered children's speech to help expand the research in this field.
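To make the task framing concrete, the following is a minimal Python sketch of how stuttering event detection in transcripts can be treated as token-level sequence labelling, together with one simple form of data augmentation. It is an illustration under stated assumptions, not the procedure used in this study: the label names ("REP" for a repeated word, "O" for a fluent word), the function names, the feature set, and the repetition-insertion strategy are all hypothetical.

```python
import random


def augment_with_repetitions(words, rate, rng):
    """Insert word repetitions into a fluent word sequence.

    Hypothetical augmentation scheme: each word is duplicated with
    probability `rate`. The inserted copy is labelled "REP" and the
    original word keeps the fluent label "O".
    """
    tokens, labels = [], []
    for w in words:
        if rng.random() < rate:
            tokens.append(w)
            labels.append("REP")  # the extra, disfluent copy
        tokens.append(w)
        labels.append("O")
    return tokens, labels


def token_features(tokens, i):
    """Build a feature dict for token i (illustrative feature set only).

    A sequence labeller such as a CRF would consume one such dict
    per token, paired with the token's label during training.
    """
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i < len(tokens) - 1 else "</s>"
    return {
        "word": tokens[i].lower(),
        "prev": prev_tok.lower(),
        "next": next_tok.lower(),
        "same_as_next": tokens[i].lower() == next_tok.lower(),
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
tokens, labels = augment_with_repetitions("i want to go home".split(), 0.5, rng)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Removing every token labelled "REP" recovers the original fluent sentence, which is a convenient sanity check when generating synthetic training data this way. Rarer event types would need their own insertion rules, which this sketch does not attempt.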
Acknowledgments
This research has been supported by the Saudi Ministry of Education, King Saud University.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Alharbi, S., Hasan, M., Simons, A.J.H., Brumfitt, S., Green, P. (2017). Detecting Stuttering Events in Transcripts of Children’s Speech. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_18
DOI: https://doi.org/10.1007/978-3-319-68456-7_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7
eBook Packages: Computer Science (R0)