Audio Events Detection in Noisy Embedded Railway Environments

  • Conference paper
Dependable Computing - EDCC 2020 Workshops (EDCC 2020)

Abstract

Ensuring passengers' safety is one of the daily concerns of railway operators. To this end, various image and sound processing techniques have been proposed by the scientific community. Since the early 2010s, advances in deep learning have made it possible to push these research areas forward, including in the railway field. This article addresses the task of detecting audio events (screams, glass breaking, gunshots, sprays) using deep learning techniques. It describes a methodology for designing a deep learning architecture that is both suitable for audio detection and optimised for embedded railway systems. We describe how we designed two CRNNs (Convolutional Recurrent Neural Networks) from scratch for the detection task. Since building a large and varied training database is one of the challenges of deep learning, the article also presents the innovative methodology used to build a database of audio events in the railway environment. Finally, we present the very promising results obtained when the model was tested in real conditions.
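
As a rough illustration of the kind of model the abstract refers to, the sketch below shows a generic CRNN for frame-level audio event detection, written in Python with PyTorch. It is not the architecture proposed in the paper: the log-mel spectrogram input, layer sizes, GRU width and the four-class output (scream, glass break, gunshot, spray) are illustrative assumptions only.

# Minimal, generic CRNN sketch for frame-level audio event detection.
# NOT the paper's architecture: input representation, layer sizes and
# the class list are illustrative assumptions.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_mels: int = 64, n_classes: int = 4):
        super().__init__()
        # Convolutional front end: learns local time-frequency patterns.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((1, 2)),              # pool along frequency only
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((1, 2)),
        )
        # Recurrent part: models the temporal context of events.
        self.gru = nn.GRU(64 * (n_mels // 4), 64,
                          batch_first=True, bidirectional=True)
        # Frame-wise classifier: one sigmoid output per event class.
        self.fc = nn.Linear(128, n_classes)

    def forward(self, x):
        # x: (batch, 1, time, n_mels) log-mel spectrogram
        x = self.conv(x)                        # (batch, 64, time, n_mels // 4)
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
        x, _ = self.gru(x)                      # (batch, time, 128)
        return torch.sigmoid(self.fc(x))        # per-frame class activities

# Example: 4 target classes (scream, glass break, gunshot, spray).
model = CRNN()
dummy = torch.randn(2, 1, 100, 64)              # 2 clips, 100 frames, 64 mel bands
print(model(dummy).shape)                       # torch.Size([2, 100, 4])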

Acknowledgement

We would like to thank Helmi Rebai and Martin Olivier for their strong contribution to the advancement of this study.

Author information

Correspondence to Tony Marteau or Sitou Afanou.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Marteau, T., Afanou, S., Sodoyer, D., Ambellouis, S., Boukour, F. (2020). Audio Events Detection in Noisy Embedded Railway Environments. In: Bernardi, S., et al. Dependable Computing - EDCC 2020 Workshops. EDCC 2020. Communications in Computer and Information Science, vol 1279. Springer, Cham. https://doi.org/10.1007/978-3-030-58462-7_2

  • DOI: https://doi.org/10.1007/978-3-030-58462-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58461-0

  • Online ISBN: 978-3-030-58462-7

  • eBook Packages: Computer Science, Computer Science (R0)
