Abstract
The basic theory of hidden Markov models was developed and applied to problems in speech recognition in the late 1960’s, and has since then been applied to numerous problems, e.g. biological sequence analysis. In this paper we consider the problem of computing the most likely string generated by a given model, and its implications on the complexity of comparing hidden Markov models. We show that computing the most likely string, and approximating its probability within any constant factor, is NP-hard, and establish the NP-hardness of comparing two hidden Markov models under the Lα- and L 1-norms. We discuss the applicability of the technique used to other measures of distance between probability distributions. In particular we show that it cannot be used to prove NP-hardness of determining the Kullback-Leibler distance between the probability distributions of two hidden Markov models, or of comparing them under the L k -norm for any xed even integer k.
Supported by grants from Carlsbergfondet and the Prog. in Math. and Mol. Biology
Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT)
funded by the University of Aarhus Research Foundation
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
N. Abe and M. K. Warmuth. On the computational complexity of approximating distributions by probabilistic automata. Machine Learning, 9:205–260, 1992.
K. Asai, S. Hayamizu, and K. Handa. Prediction of protein secondary structure by the hidden markov model. Comp. Appl. in the Biosciences, 9:141–146, 1993.
A. Bateman, E. Birney, R. Durbin, S. R. Eddy, K. L. Howe, and E. L. L. Sonnhammer. The Pfam protein families database. Nucleic Acid Research, 28:263–266, 2000.
T. Batu, L. Fortnow, R. Rubinfeld, W. D. Smith, and P. White. Testing that distributions are close. In Proc. 15th STOC, pages 259–269, 2000.
G. A. Churchill. Stochastic models for heterogeneous DNA sequences. Bull. Math. Biol., 51:79–94, 1989.
T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, New York, 1991.
L. Engebretsen and J. Holmerin. Clique is hard to approximate within n(1-o(1)). In Proc. 27th ICALP, volume 1853 of Lecture Notes in Computer Science, pages 2–12, 2000.
J. Feigenbaum, S. Kannan, M. Strauss, and M. Viswanathan. An approximate l 1-difference algorithm for massive data streams. In Proc. 40th FOCS, pages 501–511, 1999.
J. Fong and M. Strauss. An approximate l p-difference algorithm for massive data streams. In Proc. 17th STACS, 2000.
J. Håstad. Clique is hard to approximate within n 1-∈. Acta Mathematica, 182:105–142, 1999.
A. Krogh. Two methods for improving performance of an HMM and their application for gene nding. In Proc. 5th ISMB, pages 179–186, 1997.
A. Krogh, M. Brown, I. S. Mian, K. Sjölander, and D. Haussler. Hidden markov models in computational biology: Applications to protein modeling. Jour. Mol. Biol, 235:1501–1531, 1994.
R. B. Lyngsø, C. N. S. Pedersen, and H. Nielsen. Metrics and similarity measures for hidden Markov models. In Proc. 7th ISMB, pages 178–186, 1999.
L. R. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. In Proc. of the IEEE, volume 77, pages 257–286, 1989.
Y. Singer and M. K. Warmuth. Training algorithms for hidden Markov models using entropy based distance functions. In Proc. 9th NIPS, pages 641–647, 1996.
E. L. L. Sonnhammer, G. von Heijne, and A. Krogh. A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. 6th ISMB, 1998.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lyngsø, R.B., Pedersen, C.N.S. (2001). Complexity of Comparing Hidden Markov Models. In: Eades, P., Takaoka, T. (eds) Algorithms and Computation. ISAAC 2001. Lecture Notes in Computer Science, vol 2223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45678-3_36
Download citation
DOI: https://doi.org/10.1007/3-540-45678-3_36
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42985-2
Online ISBN: 978-3-540-45678-0
eBook Packages: Springer Book Archive