Topic-Dependent Language Model with Voting on Noun History

Published: 01 June 2010

Abstract

Language models (LMs) are an important field of study in automatic speech recognition (ASR) systems. The LM helps the acoustic model find the word sequence that corresponds to a given speech signal; without it, an ASR system does not understand the language, and finding the correct word sequence becomes difficult. In recent years, researchers have tried to incorporate long-range dependencies into statistical word-based n-gram LMs. One such long-range dependency is topic. Unlike words, the topic is unobservable, so the meaning behind the words must be uncovered in order to identify it. This research is based on the belief that nouns carry topic information. We propose a new approach to a topic-dependent LM in which the topic is decided in an unsupervised manner. Latent Semantic Analysis (LSA) is employed to reveal hidden (latent) relations among the nouns in the context words. To decide the topic of an event, a fixed-size word-history sequence (window) is observed, and voting is then carried out based on noun-class occurrences weighted by a confidence measure. Experiments were conducted on an English corpus and a Japanese corpus: the Wall Street Journal corpus and the Mainichi Shimbun (Japanese newspaper) corpus. The results show that the proposed method gives better perplexity than the comparative baselines, including a word-based/class-based n-gram LM, their interpolated LM, a cache-based LM, a topic-dependent LM based on n-grams, and a topic-dependent LM based on Latent Dirichlet Allocation (LDA). N-best list rescoring was conducted to validate the method's applicability to ASR systems.
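The topic-decision step described in the abstract amounts to confidence-weighted voting over the nouns seen in a fixed-size history window. The following is a minimal sketch of that idea, assuming hypothetical noun-to-class assignments (e.g. obtained by clustering LSA noun vectors) and per-noun confidence weights; the function and variable names are illustrative and are not taken from the article.

# A minimal, hypothetical sketch of confidence-weighted voting over the noun
# history window. The noun classes, confidence values, and window size below
# are illustrative assumptions, not values from the article.
from collections import defaultdict

def vote_topic(history, noun_class, confidence, window_size=20):
    """Return the topic class receiving the largest confidence-weighted vote
    from the nouns observed in the last `window_size` words, or None if the
    window contains no known nouns (fall back to a general LM in that case)."""
    votes = defaultdict(float)
    for word in history[-window_size:]:
        if word in noun_class:                      # only nouns cast votes
            votes[noun_class[word]] += confidence.get(word, 1.0)
    return max(votes, key=votes.get) if votes else None

# Toy noun-to-class map and per-noun confidence weights, assumed for illustration.
noun_class = {"stock": "finance", "market": "finance", "inning": "sports"}
confidence = {"stock": 0.9, "market": 0.7, "inning": 0.8}

history = "the stock market closed higher after a volatile session".split()
print(vote_topic(history, noun_class, confidence))  # -> finance

How the selected topic then conditions the n-gram probabilities is described in the article itself; the sketch covers only the topic-decision step.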

    Published In

    ACM Transactions on Asian Language Information Processing, Volume 9, Issue 2
    June 2010
    90 pages
    ISSN: 1530-0226
    EISSN: 1558-3430
    DOI: 10.1145/1781134

    Publisher

    Association for Computing Machinery
    New York, NY, United States

    Publication History

    Published: 01 June 2010
    Accepted: 01 February 2010
    Revised: 01 January 2010
    Received: 01 July 2009
    Published in TALIP Volume 9, Issue 2


    Author Tags

    1. Language model
    2. latent semantic analysis
    3. perplexity
    4. speech recognition
    5. topic dependent
