Marin-Hurtado et al., 2011 - Google Patents

FFT-based block processing in speech enhancement: potential artifacts and solutions

Marin-Hurtado et al., 2011

Document ID: 16507009525918145174
Author: Marin-Hurtado J; Anderson D
Publication year: 2011
Publication venue: IEEE transactions on audio, speech, and language processing

External Links

Cited by

Snippet

Most speech enhancement applications perform frequency shaping by means of multiplication in the frequency domain. Operating in the frequency domain is equivalent to convolution in the time domain. In these speech enhancement algorithms, the updating of …

Continue reading at www.researchgate.net (PDF) (other versions)

230000004044 response 0 abstract description 135

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
- G10L19/265—Pre-filtering, e.g. high frequency emphasis prior to encoding

Similar Documents

Publication	Publication Date	Title
Mowlaee et al.	2015	Harmonic phase estimation in single-channel speech enhancement using phase decomposition and SNR information
CA2732723C (en)	2016-10-11	Apparatus and method for processing an audio signal for speech enhancement using a feature extraction
CA2792449C (en)	2017-12-05	Device and method for improved magnitude response and temporal alignment in a phase vocoder based bandwidth extension method for audio signals
CA3200142C (en)	2024-02-20	Improved subband block based harmonic transposition
CN101976566A (en)	2011-02-16	Voice enhancement method and device using same
JP6544234B2 (en)	2019-07-17	Signal processing apparatus, signal processing method and signal processing program
Schröter et al.	2022	Low latency speech enhancement for hearing aids using deep filtering
Islam et al.	2014	Speech enhancement based on a modified spectral subtraction method
Esquef et al.	2006	An efficient model-based multirate method for reconstruction of audio signals across long gaps
Kim et al.	2023	HD-DEMUCS: General speech restoration with heterogeneous decoders
Fattah et al.	2011	Identification of autoregressive moving average systems based on noise compensation in the correlation domain
JP2009223210A (en)	2009-10-01	Signal band spreading device and signal band spreading method
CN108806721B (en)	2023-08-29	signal processor
Marin-Hurtado et al.	2011	FFT-based block processing in speech enhancement: potential artifacts and solutions
Hammam et al.	2021	Blind signal separation with noise reduction for efficient speaker identification
Thiem et al.	2020	Reducing artifacts in gan audio synthesis
Funaki	2008	Speech enhancement based on iterative wiener filter using complex speech analysis
Mallidi et al.	2013	Robust speaker recognition using spectro-temporal autoregressive models.
Upadhyay et al.	2012	Single channel speech enhancement utilizing iterative processing of multi-band spectral subtraction algorithm
Gui et al.	2005	Adaptive subband Wiener filtering for speech enhancement using critical-band gammatone filterbank
KR20230101829A (en)	2023-07-06	Apparatus for providing a processed audio signal, method for providing a processed audio signal, apparatus for providing neural network parameters, and method for providing neural network parameters
CN102568491A (en)	2012-07-11	Noise suppression method and equipment
Upadhyay et al.	2014	A perceptually motivated stationary wavelet packet filterbank using improved spectral over-subtraction for enhancement of speech in various noise environments
Nasir et al.	2025	A Hybrid Method for Speech Noise Reduction Using Log-MMSE
JP2002244695A (en)	2002-08-30	Speech spectrum improvement method, speech spectrum improvement device, speech spectrum improvement program, and storage medium storing program