Towards a Dialect Classification in German Speech Samples

  • Conference paper
Speech and Computer (SPECOM 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11658)

Abstract

The automatic classification of a speaker’s dialect can enrich many applications, e.g. in human-machine interaction (HMI) or natural language processing (NLP), but also in specific areas such as pronunciation tutoring, forensic analysis or the personalization of call-center conversations. Although much HMI/NLP-related research has been dedicated to affective computing, emotion recognition, semantic understanding and other advanced topics, there seems to be a lack of methods for automated dialect analysis that do not rely on transcriptions, in particular for languages such as German. For other languages such as English, Mandarin or Arabic, a multitude of feature combinations and classification methods have already been tried, which provides a starting point for our study. We describe selected experiments to train suitable classifiers on German dialect varieties in the corpus “Regional Variants of German 1” (RVG1). Our article starts with a systematic choice of appropriate spectral features. In a second step, these features are post-processed with different methods and used to train one Gaussian Mixture Model (GMM) per feature combination as a Universal Background Model (UBM). The resulting UBMs are then adapted to a varied selection of dialects by maximum-a-posteriori (MAP) adaptation. Our preliminary results on German show that dialect discrimination and classification are possible. The unweighted recognition accuracy ranges from 32.4% to 54.9% in a 3-dialect test and from 19.6% to 31.4% in a 9-dialect classification. Some dialects are more easily distinguished using purely spectral features, while others require a different feature set or more sophisticated classification methods, which we will explore in future experiments.
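
To make the GMM-UBM pipeline named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes NumPy, diagonal covariances, mean-only MAP adaptation and a relevance factor of 16; none of these details are stated in the abstract, and all function names are hypothetical. Training the UBM itself (e.g. by expectation-maximization) is omitted; the sketch starts from given UBM parameters, adapts the means to each dialect, scores a test utterance against the adapted models, and computes the unweighted accuracy as the mean of the per-class recalls.

```python
import numpy as np


def component_log_densities(X, weights, means, variances):
    """Log of (weight * diagonal Gaussian density), shape (frames, components)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2) + log_norm)
    return np.log(weights) + log_gauss


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to the frames X of one dialect.

    relevance (assumed > 0) controls how strongly the UBM prior is trusted.
    """
    log_p = component_log_densities(X, weights, means, variances)
    log_p -= np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    post = np.exp(log_p)                       # responsibilities per frame
    n_k = post.sum(axis=0)                     # soft frame count per component
    data_means = (post.T @ X) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]
    # Interpolate between the data estimate and the UBM mean.
    return alpha * data_means + (1.0 - alpha) * means


def average_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under a diagonal GMM."""
    log_p = component_log_densities(X, weights, means, variances)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))


def classify(X, weights, variances, dialect_means):
    """Return the dialect whose adapted model gives the highest likelihood."""
    scores = {
        dialect: average_log_likelihood(X, weights, adapted, variances)
        for dialect, adapted in dialect_means.items()
    }
    return max(scores, key=scores.get)


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every dialect counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / sum(1 for t in y_true if t == c))
    return sum(recalls) / len(recalls)
```

Because every class contributes equally to this metric, uniform random guessing yields an expected unweighted accuracy of 1/3 in a 3-class test and 1/9 in a 9-class test, which is a useful baseline when reading the reported ranges.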

Author information

Correspondence to Johanna Dobbriner.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Dobbriner, J., Jokisch, O. (2019). Towards a Dialect Classification in German Speech Samples. In: Salah, A., Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2019. Lecture Notes in Computer Science, vol 11658. Springer, Cham. https://doi.org/10.1007/978-3-030-26061-3_7

  • DOI: https://doi.org/10.1007/978-3-030-26061-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26060-6

  • Online ISBN: 978-3-030-26061-3

  • eBook Packages: Computer Science, Computer Science (R0)
