Short paper. DOI: 10.1145/3079452.3079492

Speech-based Diagnosis of Autism Spectrum Condition by Generative Adversarial Network Representations

Published: 02 July 2017

Abstract

Machine learning paradigms based on child vocalisations show great promise as objective markers of developmental disorders such as autism. In conventional detection systems, hand-crafted acoustic features are usually fed into a discriminative classifier (e.g., Support Vector Machines); however, it is well known that the accuracy and robustness of such systems are limited by the size of the associated training data. This paper explores, for the first time, the use of feature representations learnt using a deep Generative Adversarial Network (GAN) for classifying children's speech affected by developmental disorders. A comparative evaluation of our proposed system with different acoustic feature sets is performed on the Child Pathological and Emotional Speech database. The key experimental results demonstrate that GAN-based methods exhibit competitive performance with the conventional paradigms in terms of the unweighted average recall metric.
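
The unweighted average recall (UAR) mentioned in the abstract is the mean of the per-class recalls, so each class contributes equally however many samples it has. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name, class labels, and toy predictions are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally regardless of size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from the paper).
y_true = ["typical"] * 8 + ["atypical"] * 2
y_pred = ["typical"] * 10  # a classifier that always predicts the majority class
uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f}, UAR={uar:.2f}")  # accuracy=0.80, UAR=0.50
```

In this toy case the majority-class predictor reaches 80% accuracy but only 50% UAR, which illustrates why UAR is commonly preferred when class sizes are imbalanced.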

Published In

DH '17: Proceedings of the 2017 International Conference on Digital Health
July 2017
256 pages
ISBN: 9781450352499
DOI: 10.1145/3079452

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. autism spectrum condition
2. automatic diagnosis
3. generative adversarial networks
4. representation learning

Qualifiers

• Short-paper

Conference

DH '17: International Conference on Digital Health
July 2 - 5, 2017
London, United Kingdom

Cited By

• (2023) An Engineering View on Emotions and Speech: From Analysis and Predictive Models to Responsible Human-Centered Applications. Proceedings of the IEEE, 111:10 (1142-1158). DOI: 10.1109/JPROC.2023.3276209. Online publication date: Oct-2023
• (2022) Modeling Feature Representations for Affective Speech Using Generative Adversarial Networks. IEEE Transactions on Affective Computing, 13:2 (1098-1110). DOI: 10.1109/TAFFC.2020.2998118. Online publication date: 1-Apr-2022
• (2022) Cross-domain image translation with a novel style-guided diversity loss design. Knowledge-Based Systems, 255 (109731). DOI: 10.1016/j.knosys.2022.109731. Online publication date: Nov-2022
• (2022) Autism Detection Using Machine Learning Approach: A Review. Machine Intelligence and Smart Systems (179-197). DOI: 10.1007/978-981-16-9650-3_14. Online publication date: 24-May-2022
• (2022) Early Detection of Autism Spectrum Disorder (ASD) Using Machine Learning Techniques: A Review. Proceedings of Third International Conference on Communication, Computing and Electronics Systems (1015-1027). DOI: 10.1007/978-981-16-8862-1_66. Online publication date: 20-Mar-2022
• (2022) Latest Advances in Computational Speech Analysis for Mobile Sensing. Digital Phenotyping and Mobile Sensing (209-228). DOI: 10.1007/978-3-030-98546-2_12. Online publication date: 23-Jul-2022
• (2021) Combined Signal Processing Based Techniques and Feed Forward Neural Networks for Pathological Voice Detection and Classification. Sound & Vibration, 55:2 (141-161). DOI: 10.32604/sv.2021.011734. Online publication date: 2021
• (2021) Prosodic entrainment in individuals with autism spectrum disorder. Topics in Linguistics, 22:2 (47-61). DOI: 10.2478/topling-2021-0010. Online publication date: 30-Dec-2021
• (2021) Speech Technology for Healthcare: Opportunities, Challenges, and State of the Art. IEEE Reviews in Biomedical Engineering, 14 (342-356). DOI: 10.1109/RBME.2020.3006860. Online publication date: 2021
• (2021) A Persian speaker-independent dataset to diagnose autism infected children based on speech processing techniques. 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS) (01-05). DOI: 10.1109/ICSPIS54653.2021.9729345. Online publication date: 29-Dec-2021
