Soky et al., 2016 - Google Patents
Building wfst based grapheme to phoneme conversion for khmerSoky et al., 2016
View PDF- Document ID
- 7484526466208612738
- Author
- Soky K
- Lu X
- Shen P
- Kato H
- Kawai H
- Vanna C
- Chea V
- Publication year
- Publication venue
- Proc. Khmer Natural Language Processing (KNLP)
External Links
Snippet
Building pronunciation lexicon is one of the most important steps for building automatic speech recognition (ASR) systems and text-tospeech (TTS) systems. For a low-resource language, there is no or lack of such a dictionary (lexicon) cracked by experts. Even there is …
- 238000006243 chemical reaction 0 title abstract description 16
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G06F17/2223—Handling non-latin characters, e.g. kana-to-kanji conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ramani et al. | A common attribute based unified HTS framework for speech synthesis in Indian languages | |
Schuster et al. | Japanese and korean voice search | |
Kaur et al. | Review of machine transliteration techniques | |
Alsharhan et al. | Improved Arabic speech recognition system through the automatic generation of fine-grained phonetic transcriptions | |
Demberg et al. | Phonological constraints and morphological preprocessing for grapheme-to-phoneme conversion | |
Alsharhan et al. | Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic | |
Ananthakrishnan et al. | Automatic diacritization of Arabic transcripts for automatic speech recognition | |
Cardenas et al. | Siminchik: A speech corpus for preservation of southern quechua | |
Ashihara et al. | SpeechGLUE: How well can self-supervised speech models capture linguistic knowledge? | |
Soky et al. | Building wfst based grapheme to phoneme conversion for khmer | |
Nikulásdóttir et al. | An Icelandic pronunciation dictionary for TTS | |
Besacier et al. | ASR and translation for under-resourced languages | |
Qiao et al. | Small-vocabulary speech recognition for resource-scarce languages | |
Sar et al. | Applying linguistic G2P knowledge on a statistical grapheme-to-phoneme conversion in Khmer | |
Büchi et al. | Zhaw-init at germeval 2020 task 4: Low-resource speech-to-text | |
Nursetyo | LatAksLate: Javanese script translator based on Indonesian speech recognition using sphinx-4 and google API | |
Al-Haj et al. | Pronunciation modeling for dialectal Arabic speech recognition | |
El-Hadi et al. | Phonetisaurus-based letter-to-sound transcription for Standard Arabic | |
Nurtomo | Greedy algorithms to optimize a sentence set near-uniformly distributed on syllable units and punctuation marks | |
Nouza et al. | A study on adapting Czech automatic speech recognition system to Croatian language | |
Rajendran et al. | Text processing for developing unrestricted Tamil text to speech synthesis system | |
Dasgupta et al. | A joint source channel model for the English to Bengali back transliteration | |
Pandey et al. | Development and suitability of indian languages speech database for building watson based asr system | |
Gros et al. | SI-PRON pronunciation lexicon: a new language resource for Slovenian | |
Hlaing et al. | Word Representations for Neural Network Based Myanmar Text-to-Speech S. |