Abstract
Building multilingual spoken language translation systems requires knowledge of both the acoustic models and the language models of each language to be translated. Our multilingual translation system JANUS-2 translates English and German spoken input into English, German, Spanish, Japanese or Korean output. Obtaining optimal acoustic and language models and developing adequate dictionaries for all of these languages requires extensive hand-tuning, which is time-consuming and labor-intensive. In this paper we present learning techniques that improve acoustic models by automatically adapting codebook sizes, a learning algorithm that extends and adapts phonetic dictionaries for the recognition process, and a statistically based language model that incorporates some linguistic knowledge to increase recognition performance. To ensure a robust translation system, semantic rather than syntactic analysis is performed. Concept-based speech translation and a connectionist parser that learns to parse into feature structures are introduced. Furthermore, different repair mechanisms for recovering from recognition errors are described.
Our German recognition engine, developed at the University of Karlsruhe, is part of the VERBMOBIL project and of the VERBMOBIL systems developed under BMBF funding. The Spanish speech translation module was developed at Carnegie Mellon University under project ENTHUSIAST, funded by the US Government. Other components are under development in collaboration with partners of the C-STAR Consortium.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Geutner, P. et al. (1996). Integrating different learning approaches into a multilingual spoken language translation system. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_42
DOI: https://doi.org/10.1007/3-540-60925-3_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive