[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Morales-Cordovilla et al., 2014 - Google Patents

Distant speech recognition in reverberant noisy conditions employing a microphone array

Morales-Cordovilla et al., 2014

View PDF
Document ID
2021858435034491912
Author
Morales-Cordovilla J
Hagmüller M
Pessentheiner H
Kubin G
Publication year
Publication venue
2014 22nd European Signal Processing Conference (EUSIPCO)

External Links

Snippet

This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a microphone array. We present a prototype system that can segment the utterances in real-time and generate robust ASR results off-line. The segmentation is …
Continue reading at www.academia.edu (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166Microphone arrays; Beamforming
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065Adaptation
    • G10L15/07Adaptation to the speaker
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/04Training, enrolment or model building
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/14Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
    • G10L15/142Hidden Markov Models [HMMs]
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/06Decision making techniques; Pattern matching strategies
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones

Similar Documents

Publication Publication Date Title
EP3707716B1 (en) Multi-channel speech separation
Vincent et al. The second ‘CHiME’speech separation and recognition challenge: An overview of challenge systems and outcomes
Kumatani et al. Channel selection based on multichannel cross-correlation coefficients for distant speech recognition
Hori et al. The MERL/SRI system for the 3rd CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition
Minhua et al. Frequency domain multi-channel acoustic modeling for distant speech recognition
Mitra et al. Evaluating robust features on deep neural networks for speech recognition in noisy and channel mismatched conditions
DEREVERBERATION et al. REVERB Workshop 2014
Matassoni et al. Hidden Markov model training with contaminated speech material for distant-talking speech recognition
Matassoni et al. The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones
KR20210137146A (en) Speech augmentation using clustering of queues
Bohlender et al. Neural networks using full-band and subband spatial features for mask based source separation
Gu et al. Rezero: Region-customizable sound extraction
Yamakawa et al. Environmental sound recognition for robot audition using matching-pursuit
US11528571B1 (en) Microphone occlusion detection
Morales-Cordovilla et al. Distant speech recognition in reverberant noisy conditions employing a microphone array
Varela et al. Combining pulse-based features for rejecting far-field speech in a HMM-based voice activity detector
Xiong et al. Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments
Yoshioka et al. Noise model transfer: Novel approach to robustness against nonstationary noise
Rodomagoulakis et al. Experiments on far-field multichannel speech processing in smart homes
Mitra et al. Deep convolutional nets and robust features for reverberation-robust speech recognition
Kindt et al. Improved separation of closely-spaced speakers by exploiting auxiliary direction of arrival information within a u-net architecture
Hu et al. Single-channel speaker diarization based on spatial features
Morales-Cordovilla et al. Room localization for distant speech recognition.
Morales-Cordovilla et al. A German distant speech recognizer based on 3D beamforming and harmonic missing data mask
Xue et al. A study on improving acoustic model for robust and far-field speech recognition