Barker et al., 2017 - Google Patents
The CHiME challenges: Robust speech recognition in everyday environments (Barker et al., 2017)
- Document ID
- 11392762392703662616
- Authors
- Barker J
- Marxer R
- Vincent E
- Watanabe S
- Publication year
- 2017
- Publication venue
- New era for robust speech recognition: Exploiting deep learning
Snippet
The CHiME challenge series has been aiming to advance the development of robust automatic speech recognition for use in everyday environments by encouraging research at the interface of signal processing and statistical modelling. The series has been running …
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
        - G06F3/16—Sound input; Sound output
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/063—Training
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
        - G10L15/28—Constructional details of speech recognition systems
      - G10L17/00—Speaker identification or verification
        - G10L17/04—Training, enrolment or model building
      - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
        - G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
        - G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
          - G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
            - G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
Similar Documents
| Publication | Title |
| --- | --- |
| Haeb-Umbach et al. | Speech processing for digital home assistants: Combining signal processing with deep-learning techniques |
| Barker et al. | The CHiME challenges: Robust speech recognition in everyday environments |
| Zmolikova et al. | Neural target speech extraction: An overview |
| US11620983B2 (en) | Speech recognition method, device, and computer-readable storage medium |
| Barker et al. | The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines |
| EP3707716B1 (en) | Multi-channel speech separation |
| Barker et al. | The fifth ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines |
| Barker et al. | The PASCAL CHiME speech separation and recognition challenge |
| Barker et al. | The third ‘CHiME’ speech separation and recognition challenge: Analysis and outcomes |
| Wölfel et al. | Distant speech recognition |
| Hoshen et al. | Speech acoustic modeling from raw multichannel waveforms |
| Vincent et al. | The second ‘CHiME’ speech separation and recognition challenge: An overview of challenge systems and outcomes |
| Harper | The automatic speech recognition in reverberant environments (ASpIRE) challenge |
| KR101991733B1 (en) | Systems and methods for speech transcription |
| Sugiura et al. | Rospeex: A cloud robotics platform for human-robot spoken dialogues |
| Bertin et al. | VoiceHome-2, an extended corpus for multichannel speech processing in real homes |
| Bertin et al. | A French corpus for distant-microphone speech processing in real homes |
| Agrawal et al. | A review on speech separation in cocktail party environment: Challenges and approaches |
| Sivasankaran et al. | Discriminative importance weighting of augmented training data for acoustic model training |
| Opochinsky et al. | Single-microphone speaker separation and voice activity detection in noisy and reverberant environments |
| Nakadai et al. | A robot referee for rock-paper-scissors sound games |
| Kinoshita et al. | The REVERB challenge: A benchmark task for reverberation-robust ASR techniques |
| Tsiami et al. | ATHENA: A Greek multi-sensory database for home automation control |
| Li et al. | A fast convolutional self-attention based speech dereverberation method for robust speech recognition |
| Fu et al. | IEEE SLT 2021 Alpha-mini speech challenge: Open datasets, tracks, rules and baselines |