

Unsupervised Speaker Adaptation Using All-Phoneme Ergodic Hidden Markov Network

Yasunaga MIYAZAWA
Jun-ichi TAKAMI
Shigeki SAGAYAMA
Shoichi MATSUNAGA

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E78-D, No.8, pp.1044-1050
Publication Date: 1995/08/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
Keyword: speech recognition, unsupervised speaker adaptation, all-phoneme ergodic hidden Markov network, context-dependent phoneme bigram




Summary: 
This paper proposes an unsupervised speaker adaptation method using an "all-phoneme ergodic Hidden Markov Network" that combines allophonic (context-dependent phone) acoustic models with stochastic language constraints. A Hidden Markov Network (HMnet) for allophone modeling and allophonic bigram probabilities derived from a large text database are combined to yield a single large ergodic HMM that represents arbitrary speech signals in a particular language, so that the model parameters can be re-estimated with the Baum-Welch algorithm from speech samples whose text is unknown. When this is combined with the Vector Field Smoothing (VFS) technique, unsupervised speaker adaptation can be performed effectively. In experiments, this method performed better than our previous unsupervised adaptation method, which used conventional phonetic HMMs and phoneme bigram probabilities, especially when the amount of training data was small.
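
The idea of joining unit models through bigram probabilities into one ergodic HMM can be illustrated with a small sketch. It is not the paper's implementation: the summary does not describe the HMnet topology or its state sharing, so the function name `build_ergodic_transitions`, the independent left-to-right units, and the fixed self-loop probability `stay` are all assumptions made for illustration. The sketch only shows how exit transitions weighted by bigram probabilities turn a set of unit models into a single transition matrix in which any unit can follow any other.

```python
def build_ergodic_transitions(bigram, states_per_unit=3, stay=0.6):
    """Join left-to-right unit models into one ergodic transition matrix.

    bigram[u][v] is P(next unit = v | current unit = u); each row sums to 1.
    All names and parameter values here are hypothetical, not from the paper.
    """
    n_units = len(bigram)
    n_states = n_units * states_per_unit
    A = [[0.0] * n_states for _ in range(n_states)]
    for u in range(n_units):
        first = u * states_per_unit
        last = first + states_per_unit - 1
        for s in range(first, last + 1):
            A[s][s] = stay  # self-loop inside the unit
            if s < last:
                A[s][s + 1] = 1.0 - stay  # advance to the next state of the unit
        # Leaving the unit: spread the exit probability over the entry
        # states of all units, weighted by the bigram probabilities.
        for v in range(n_units):
            A[last][v * states_per_unit] += (1.0 - stay) * bigram[u][v]
    return A


# Toy example with two units and two states per unit.
bigram = [[0.1, 0.9],
          [0.7, 0.3]]
A = build_ergodic_transitions(bigram, states_per_unit=2, stay=0.5)
for row in A:
    print([round(p, 3) for p in row])
```

Every row of the resulting matrix sums to one, so it can serve as the transition matrix of a single HMM. Because no transcription is needed to define it, an HMM of this form can be re-estimated with Baum-Welch directly on unlabeled speech, which is the property the summary relies on.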

