DSP-based voice activity detection and background noise reduction

Charu Singh¹ (ORCID: orcid.org/0000-0003-0271-0594), Maarten Venter², Rajesh Kumar Muthu¹ & David Brown²

Abstract

Speech processing devices such as voice-controlled devices, radios, and cell phones have gained popularity in the military, audio forensics, speech recognition, education, and health sectors. In the real world, a speech signal always contains background noise during communication. Voice activity detection (VAD) is a main task in speech-related applications, which include speech communication, speech recognition, and speech coding. Noise-reduction schemes for speech communication may increase the quality of speech and improve working efficiency in military aviation. Most existing algorithms can improve the quality of speech but are unable to remove the background noise from it. This study gives researchers a summary of the challenges in speech communication with background noise and provides research directions for military personnel and workforces who work in noisy environments. The results of the study reveal that the DSP-based voice activity detection and background noise reduction algorithm reduced the spurious values of the speech signal.

Author information

Authors and Affiliations

Vellore Institute of Technology, Vellore, India
Charu Singh & Rajesh Kumar Muthu
Sat-Com (PTY) Ltd, Windhoek, Namibia
Maarten Venter & David Brown

Corresponding author

Correspondence to Charu Singh.

About this article

Cite this article

Singh, C., Venter, M., Muthu, R.K. et al. DSP-based voice activity detection and background noise reduction. Int J Speech Technol 21, 851–859 (2018). https://doi.org/10.1007/s10772-018-9556-z

Received: 03 March 2018
Accepted: 16 September 2018
Published: 18 October 2018
Issue Date: 15 December 2018
DOI: https://doi.org/10.1007/s10772-018-9556-z
