Noise is everywhere. In most audio and speech applications, such as human-machine interfaces, hands-free communications, voice over IP (VoIP), hearing aids, and teleconferencing/telepresence/telecollaboration systems, the signal of interest (usually speech) picked up by a microphone is generally contaminated by noise. As a result, the microphone signal has to be cleaned up with digital signal processing tools before it is stored, analyzed, transmitted, or played out. This cleaning process is often called noise reduction, and it has attracted a considerable amount of research and engineering attention for several decades. One objective of this book is to present, in a common framework, an overview of the state of the art of noise reduction algorithms in the single-channel (one-microphone) case. The focus is on the most useful approaches, i.e., filtering techniques (in different domains) and spectral enhancement methods. The other objective of Noise Reduction in Speech Processing is to derive all these well-known techniques in a rigorous way and to prove many fundamental and intuitive results that are often taken for granted. This book is written especially for graduate students and research engineers who work on noise reduction for speech and audio applications and want to understand the subtle mechanisms behind each approach. Many new and interesting concepts are presented in this text that we hope readers will find useful and inspiring.