Gruhn et al., 2011 - Google Patents

Properties of Non-native Speech

Gruhn et al., 2011

Document ID: 6044905366344363439
Author: Gruhn R; Minker W; Nakamura S; Gruhn R; Minker W; Nakamura S
Publication year: 2011
Publication venue: Statistical Pronunciation Modeling for Non-Native Speech Processing

External Links

Cited by

Snippet

A typical example for a speech recognition system that needs to understand non-native speech is a car navigation system. In Europe, a driver who travels across borders and inputs a destination city or street name of a foreign country into a speech-controlled navigation …

Continue reading at link.springer.com (other versions)

239000011159 matrix material 0 description 7

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Taking into account non-speech caracteristics
- G10L2015/228—Taking into account non-speech caracteristics of application context
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/005—Language recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output

Similar Documents

Publication	Publication Date	Title
US9449599B2 (en)	2016-09-20	Systems and methods for adaptive proper name entity recognition and understanding
US11830485B2 (en)	2023-11-28	Multiple speech processing system with synthesized speech styles
KR102390940B1 (en)	2022-04-26	Context biasing for speech recognition
US11074909B2 (en)	2021-07-27	Device for recognizing speech input from user and operating method thereof
Jansen et al.	2013	A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition
JP7092953B2 (en)	2022-06-28	Phoneme-based context analysis for multilingual speech recognition with an end-to-end model
US9990916B2 (en)	2018-06-05	Method to synthesize personalized phonetic transcription
KR20210146368A (en)	2021-12-03	End-to-end automatic speech recognition for digit sequences
US20080177541A1 (en)	2008-07-24	Voice recognition device, voice recognition method, and voice recognition program
US9594744B2 (en)	2017-03-14	Speech transcription including written text
Luong et al.	2016	A non-expert Kaldi recipe for Vietnamese speech recognition system
JP7557085B2 (en)	2024-09-26	Instant text-to-speech learning during dialogue
Abed et al.	2019	Diacritics effect on Arabic speech recognition
Uebler	2001	Multilingual speech recognition in seven languages
EP3005152B1 (en)	2024-03-27	Systems and methods for adaptive proper name entity recognition and understanding
Abushariah et al.	2022	Bilingual automatic speech recognition: A review, taxonomy and open challenges
Ons et al.	2014	Fast vocabulary acquisition in an NMF-based self-learning vocal user interface
JP2010197644A (en)	2010-09-09	Speech recognition system
Gruhn et al.	2011	Properties of Non-native Speech
AbuZeina et al.	2012	Arabic speech recognition systems
Vékony	2016	Speech recognition challenges in the car navigation industry
Horndasch	2022	Using Contextual Information to Process Out-of-Vocabulary Words in Spoken Dialog Systems
Bacchiani et al.	2017	Speech research at Google to enable universal speech interfaces
Horndasch	2022	Using Contextual Information to Process Out-of-Vocabulary Words in Spoken Dialog Systems Ausnutzung von Kontextinformation bei der Verarbeitung von unbekannten
Thymé-Gobbel et al.	2021	Creating Robust Coverage for Speech-to-Text Resolution