Soky et al., 2016 - Google Patents

Building wfst based grapheme to phoneme conversion for khmer

Soky et al., 2016

Document ID: 7484526466208612738
Author: Soky K; Lu X; Shen P; Kato H; Kawai H; Vanna C; Chea V
Publication year: 2016
Publication venue: Proc. Khmer Natural Language Processing (KNLP)

External Links

Cited by

Snippet

Building pronunciation lexicon is one of the most important steps for building automatic speech recognition (ASR) systems and text-tospeech (TTS) systems. For a low-resource language, there is no or lack of such a dictionary (lexicon) cracked by experts. Even there is …

Continue reading at ksoky.github.io (PDF) (other versions)

238000006243 chemical reaction 0 title abstract description 16

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G06F17/2223—Handling non-latin characters, e.g. kana-to-kanji conversion
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Ramani et al.	2013	A common attribute based unified HTS framework for speech synthesis in Indian languages
Schuster et al.	2012	Japanese and korean voice search
Kaur et al.	2014	Review of machine transliteration techniques
Alsharhan et al.	2019	Improved Arabic speech recognition system through the automatic generation of fine-grained phonetic transcriptions
Demberg et al.	2007	Phonological constraints and morphological preprocessing for grapheme-to-phoneme conversion
Alsharhan et al.	2022	Evaluating the effect of using different transcription schemes in building a speech recognition system for Arabic
Ananthakrishnan et al.	2005	Automatic diacritization of Arabic transcripts for automatic speech recognition
Cardenas et al.	2018	Siminchik: A speech corpus for preservation of southern quechua
Ashihara et al.	2023	SpeechGLUE: How well can self-supervised speech models capture linguistic knowledge?
Soky et al.	2016	Building wfst based grapheme to phoneme conversion for khmer
Nikulásdóttir et al.	2018	An Icelandic pronunciation dictionary for TTS
Besacier et al.	2006	ASR and translation for under-resourced languages
Qiao et al.	2010	Small-vocabulary speech recognition for resource-scarce languages
Sar et al.	2019	Applying linguistic G2P knowledge on a statistical grapheme-to-phoneme conversion in Khmer
Büchi et al.	2020	Zhaw-init at germeval 2020 task 4: Low-resource speech-to-text
Nursetyo	2018	LatAksLate: Javanese script translator based on Indonesian speech recognition using sphinx-4 and google API
Al-Haj et al.	2009	Pronunciation modeling for dialectal Arabic speech recognition
El-Hadi et al.	2017	Phonetisaurus-based letter-to-sound transcription for Standard Arabic
Nurtomo	2018	Greedy algorithms to optimize a sentence set near-uniformly distributed on syllable units and punctuation marks
Nouza et al.	2012	A study on adapting Czech automatic speech recognition system to Croatian language
Rajendran et al.	2015	Text processing for developing unrestricted Tamil text to speech synthesis system
Dasgupta et al.	2013	A joint source channel model for the English to Bengali back transliteration
Pandey et al.	2013	Development and suitability of indian languages speech database for building watson based asr system
Gros et al.	2006	SI-PRON pronunciation lexicon: a new language resource for Slovenian
Hlaing et al.	2020	Word Representations for Neural Network Based Myanmar Text-to-Speech S.