Malek et al., 2017 - Google Patents
Speaker extraction using LCMV beamformer with DNN-based SPP and RTF identification schemeMalek et al., 2017
View PDF- Document ID
- 4482201501413881379
- Author
- Malek A
- Chazan S
- Malka I
- Tourbabin V
- Goldberger J
- Tzirkel-Hancock E
- Gannot S
- Publication year
- Publication venue
- 2017 25th European Signal Processing Conference (EUSIPCO)
External Links
Snippet
The linearly constrained minimum variance (LCMV)-beamformer (BF) is a viable solution for desired source extraction from a mixture of speakers in a noisy environment. The performance in terms of speech distortion, interference cancellation and noise reduction …
- 238000000605 extraction 0 title abstract description 6
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02163—Only one microphone
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/06—Decision making techniques; Pattern matching strategies
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Schwartz et al. | Online speech dereverberation using Kalman filter and EM algorithm | |
Souden et al. | A multichannel MMSE-based framework for speech source separation and noise reduction | |
Taseska et al. | Informed spatial filtering for sound extraction using distributed microphone arrays | |
Krueger et al. | Speech enhancement with a GSC-like structure employing eigenvector-based transfer function ratios estimation | |
Krueger et al. | Model-based feature enhancement for reverberant speech recognition | |
Schwartz et al. | An expectation-maximization algorithm for multimicrophone speech dereverberation and noise reduction with coherence matrix estimation | |
Koldovský et al. | Spatial source subtraction based on incomplete measurements of relative transfer function | |
Kumatani et al. | Beamforming with a maximum negentropy criterion | |
Martín-Doñas et al. | Dual-channel DNN-based speech enhancement for smartphones | |
Hoang et al. | Joint maximum likelihood estimation of power spectral densities and relative acoustic transfer functions for acoustic beamforming | |
EP3847645B1 (en) | Determining a room response of a desired source in a reverberant environment | |
Shankar et al. | Real-time dual-channel speech enhancement by VAD assisted MVDR beamformer for hearing aid applications using smartphone | |
Sadjadi et al. | Blind reverberation mitigation for robust speaker identification | |
Chazan et al. | DNN-based concurrent speakers detector and its application to speaker extraction with LCMV beamforming | |
Zohourian et al. | GSC-based binaural speaker separation preserving spatial cues | |
Malek et al. | Speaker extraction using LCMV beamformer with DNN-based SPP and RTF identification scheme | |
Hong et al. | Dual-microphone noise reduction in car environments with determinant analysis of input correlation matrix | |
Bavkar et al. | PCA based single channel speech enhancement method for highly noisy environment | |
Han et al. | Reverberation and noise robust feature compensation based on IMM | |
Šarić et al. | Supervised speech separation combined with adaptive beamforming | |
Pfeifenberger et al. | Eigenvector-Based Speech Mask Estimation Using Logistic Regression. | |
Pfeifenberger et al. | Blind source extraction based on a direction-dependent a-priori SNR. | |
Chazan et al. | LCMV beamformer with DNN-based multichannel concurrent speakers detector | |
Nakatani et al. | Simultaneous denoising, dereverberation, and source separation using a unified convolutional beamformer | |
Laufer et al. | ML estimation and CRBs for reverberation, speech, and noise PSDs in rank-deficient noise field |