Barker et al., 2017 - Google Patents
The CHiME challenges: Robust speech recognition in everyday environments (Barker et al., 2017)
- Document ID
- 11392762392703662616
- Authors
- Barker J
- Marxer R
- Vincent E
- Watanabe S
- Publication year
- 2017
- Publication venue
- New era for robust speech recognition: Exploiting deep learning
Snippet
The CHiME challenge series has been aiming to advance the development of robust automatic speech recognition for use in everyday environments by encouraging research at the interface of signal processing and statistical modelling. The series has been running …
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
        - G06F3/16—Sound input; Sound output
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/063—Training
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
        - G10L15/28—Constructional details of speech recognition systems
      - G10L17/00—Speaker identification or verification
        - G10L17/04—Training, enrolment or model building
      - G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
        - G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
      - G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
        - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
        - G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
          - G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
            - G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
Similar Documents
| Publication | Title |
| --- | --- |
| Haeb-Umbach et al. | Speech processing for digital home assistants: Combining signal processing with deep-learning techniques |
| Barker et al. | The CHiME challenges: Robust speech recognition in everyday environments |
| Zmolikova et al. | Neural target speech extraction: An overview |
| US11620983B2 (en) | Speech recognition method, device, and computer-readable storage medium |
| Barker et al. | The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines |
| EP3707716B1 (en) | Multi-channel speech separation |
| Barker et al. | The fifth ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines |
| Barker et al. | The PASCAL CHiME speech separation and recognition challenge |
| Barker et al. | The third ‘CHiME’ speech separation and recognition challenge: Analysis and outcomes |
| Wölfel et al. | Distant speech recognition |
| Hoshen et al. | Speech acoustic modeling from raw multichannel waveforms |
| Vincent et al. | The second ‘CHiME’ speech separation and recognition challenge: An overview of challenge systems and outcomes |
| Harper | The automatic speech recognition in reverberant environments (ASpIRE) challenge |
| KR101991733B1 (en) | Systems and methods for speech transcription |
| Sugiura et al. | Rospeex: A cloud robotics platform for human-robot spoken dialogues |
| Bertin et al. | VoiceHome-2, an extended corpus for multichannel speech processing in real homes |
| Bertin et al. | A French corpus for distant-microphone speech processing in real homes |
| Agrawal et al. | A review on speech separation in cocktail party environment: Challenges and approaches |
| Sivasankaran et al. | Discriminative importance weighting of augmented training data for acoustic model training |
| Opochinsky et al. | Single-microphone speaker separation and voice activity detection in noisy and reverberant environments |
| Nakadai et al. | A robot referee for rock-paper-scissors sound games |
| Kinoshita et al. | The REVERB challenge: A benchmark task for reverberation-robust ASR techniques |
| Tsiami et al. | ATHENA: A Greek multi-sensory database for home automation control |
| Li et al. | A fast convolutional self-attention based speech dereverberation method for robust speech recognition |
| Fu et al. | IEEE SLT 2021 Alpha-mini speech challenge: Open datasets, tracks, rules and baselines |