Abstract
Bayesian network learning is a useful tool for exploratory data analysis. However, applying Bayesian networks to the analysis of large-scale data, consisting of thousands of attributes, is not straightforward because of the heavy computational burden in learning and visualization. In this paper, we propose a novel method for large-scale data analysis based on hierarchical compression of information and constrained structural learning, i.e., hierarchical Bayesian networks (HBNs). The HBN can compactly visualize global probabilistic structure through a small number of hidden variables, approximately representing a large number of observed variables. An efficient learning algorithm for HBNs, which incrementally maximizes the lower bound of the likelihood function, is also suggested. The effectiveness of our method is demonstrated by the experiments on synthetic large-scale Bayesian networks and a real-life microarray dataset.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel- Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., Sherlock, G.: Gene Ontology: tool for the unification of biology. Nature Genetics 25(1), 25–29 (2000)
Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)
Batagelj, V., Mrvar, A.: Pajek - program for large network analysis. Connections 21(2), 47–57 (1998)
Friedman, N.: Inferring cellular networks using probabilistic graphical models. Science 303(6), 799–805 (2004)
Friedman, N., Nachman, I., Peér, D.: Learning Bayesian network structure from massive datasets: the “sparse candidate” algorithm. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), pp. 206–215 (1999)
Goldenberg, A., Moore, A.: Tractable learning of large Bayes net structures from sparse data. In: Proceedings of the Twentifirst International Conference on Machine Learning, ICML (2004)
Gyftodimos, E., Flach, P.: Hierarchical Bayesian networks: an approach to classification and learning for structured data. In: Vouros, G.A., Panayiotopoulos, T. (eds.) SETN 2004. LNCS (LNAI), vol. 3025, pp. 291–300. Springer, Heidelberg (2004)
Hwang, K.-B., Lee, J.W., Chung, S.-W., Zhang, B.-T.: Construction of large-scale Bayesian networks by local to global search. In: Ishizuka, M., Sattar, A. (eds.) PRICAI 2002. LNCS (LNAI), vol. 2417, pp. 375–384. Springer, Heidelberg (2002)
Nikovski, D.: Constructing Bayesian networks for medical diagnosis from incomplete and partially correct statistics. IEEE Transactions on Knowledge and Data Engineering 12(4), 509–516 (2000)
Park, S., Aggarwal, J.K.: Recognition of two-person interactions using a hierarchical Bayesian network. In: Proceedings of the First ACM SIGMM International Workshop on Video Surveillance (IWVS), pp. 65–76 (2003)
Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., Futcher, B.: Comprehensive identification of cell cycleregulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9(12), 3273–3297 (1998)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hwang, KB., Kim, BH., Zhang, BT. (2006). Learning Hierarchical Bayesian Networks for Large-Scale Data Analysis. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4232. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893028_75
Download citation
DOI: https://doi.org/10.1007/11893028_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46479-2
Online ISBN: 978-3-540-46480-8
eBook Packages: Computer ScienceComputer Science (R0)