
Acoustic modelling of subword units in the Isadora speech recognizer

Published: 23 March 1992

Abstract

This paper addresses the choice of suitable subword units for the HMM-based front-end of a speaker-independent, large-vocabulary continuous speech dialog system (EVAR [1]). In contrast to the well-known approach of using context-dependent phone-like units (for instance, generalized triphones), we developed inventories of larger subword units, so-called context-freezing units (CFUs). CFU models can be considered an approximation to the highly desirable situation of having whole-word HMMs, given the limited training speech data at hand. Recognition experiments indicate an advantage of the context-freezing units over triphone/biphone/phone combinations in terms of word accuracy, at least for German speech. Using triphones with contexts generalized by means of broad phonetic classes, we achieved results comparable to those obtained with CFUs.
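The idea of generalizing triphone contexts by broad phonetic classes can be illustrated with a minimal sketch. It is not the paper's implementation: the class inventory `BROAD_CLASS`, the boundary symbol `#`, the `left-phone+right` unit notation, and the function name `generalized_triphones` are all assumptions made for illustration, since the abstract does not specify any of them.

```python
# Hypothetical broad phonetic classes (assumption: the abstract does not
# list the class inventory actually used in the paper).
BROAD_CLASS = {
    "a": "VOW", "e": "VOW", "i": "VOW", "o": "VOW", "u": "VOW",
    "p": "PLOS", "t": "PLOS", "k": "PLOS",
    "b": "PLOS", "d": "PLOS", "g": "PLOS",
    "f": "FRIC", "s": "FRIC", "z": "FRIC", "v": "FRIC",
    "m": "NAS", "n": "NAS",
    "l": "LIQ", "r": "LIQ",
}

BOUNDARY = "#"  # hypothetical word-boundary symbol


def generalized_triphones(phones):
    """Map a phone sequence to triphone labels whose left and right
    contexts are replaced by broad phonetic classes.

    The center phone is kept as-is. A context symbol that has no entry
    in BROAD_CLASS (including the boundary symbol) is left unchanged.
    """
    padded = [BOUNDARY] + list(phones) + [BOUNDARY]
    units = []
    for i in range(1, len(padded) - 1):
        left = BROAD_CLASS.get(padded[i - 1], padded[i - 1])
        right = BROAD_CLASS.get(padded[i + 1], padded[i + 1])
        units.append(f"{left}-{padded[i]}+{right}")
    return units


if __name__ == "__main__":
    # Both words place "a" between two plosives, so the center phone
    # maps to the same generalized unit and can share training data.
    print(generalized_triphones(["t", "a", "k"]))
    print(generalized_triphones(["d", "a", "g"]))
```

Under these assumed classes, the vowel in both example sequences is assigned the same unit label, which is the point of context generalization: fewer distinct models, each trained on more data.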

References

[1]
H. Niemann, A. Brietzmann, U. Ehrlich, S. Posch, P. Regel, G. Sagerer, R. Salzbrunn, and G. Schukat-Talamazzini. A knowledge based speech understanding system. Int. J. Pattern Recognition and Artificial Intelligence, 2(2):321-350, 1988.
[2]
K.-F. Lee. Automatic Speech Recognition: the Development of the SPHINX System. Kluwer Academic Publishers, Boston, 1989.
[3]
R. Schwartz, Y. Chow, O. Kimball, S. Roucos, M. Krasner, and J. Makhoul. Context-dependent modeling for acoustic-phonetic recognition of continuous speech. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 1205-1208, Tampa, Florida, 1985.
[4]
E.G. Schukat-Talamazzini and H. Niemann. Das ISADORA-System - ein akustisch-phonetisches Netzwerk zur automatischen Spracherkennung. In Proc. 13. DAGM-Symposium, pages 251-258, Springer, 1991.
[5]
L. Deng, M. Lennig, F. Seitz, and P. Mermelstein. Large vocabulary word recognition using context-dependent allophonic hidden Markov models. Computer Speech & Language, 4(4):345-357, 1990.
[6]
E.G. Schukat-Talamazzini. Automatic generation and evaluation of phone superclasses for continuous speech recognition. In I.T. Young, editor, Signal Processing III: Theories and Applications (EUSIPCO-86), pages 537-540, Elsevier Publ., North-Holland, 1986.
[7]
G. Ruske and T. Schotola. The efficiency of demisyllable segmentation in the recognition of spoken words. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 971-974, 1981.
[8]
S. Rieck, E.G. Schukat-Talamazzini, W. Eckert, T. Kuhn, R. Kompe, A. Kießling, M. Mast, H. Niemann, and E. Nöth. Linear transformierte Bark-Spektrum-basierte Merkmale zur automatischen Spracherkennung. Submitted to the 1992 DAGA conference, 1992.
[9]
M.J. Hunt, S.M. Richardson, D.C. Bateman, and A. Piau. An investigation of PLP and IMELDA acoustic representations and of their potential for combination. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 881-884, Toronto, 1991.

Published In

ICASSP'92: Proceedings of the 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 1
March 1992
646 pages
ISBN: 0780305329

Sponsors

  • IEEE-SPS: Signal Processing Society

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article
