Abstract
A popular strategy for dealing with large parameter estimation problems is to split the problem into manageable subproblems and solve them cyclically one by one until convergence. A well-known drawback of this strategy is slow convergence in low-noise conditions. We propose using so-called pattern searches, which consist of an exploratory phase followed by a line search. During the exploratory phase, a search direction is determined by combining the individual updates of all subproblems. With modest algorithmic modifications, the approach can be used to speed up several well-known learning methods, such as variational Bayesian learning (ensemble learning) and the expectation-maximization (EM) algorithm. Experimental results show that the proposed method is able to reduce the required convergence time by 60–85% in realistic variational Bayesian learning problems.
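The idea sketched in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation; it applies the same two-phase scheme (an exploratory cyclic sweep that yields a combined update direction, followed by a line search along that direction) to exact coordinate-wise minimization of an ill-conditioned quadratic, where plain cyclic updates are known to converge slowly. The cost function, the candidate step lengths, and the quadratic matrix `A` are all illustrative choices.

```python
import numpy as np

# Toy cost: an ill-conditioned quadratic in two strongly coupled
# parameter blocks. Strong coupling makes plain cyclic updates slow,
# which is the situation the pattern-search acceleration targets.
A = np.array([[3.0, 2.9],
              [2.9, 3.0]])

def cost(theta):
    return 0.5 * theta @ A @ theta

def cyclic_sweep(theta):
    """One cycle of exact coordinate-wise minimization (one subproblem
    solved at a time, others held fixed)."""
    theta = theta.copy()
    for i in range(len(theta)):
        theta[i] = -(A[i] @ theta - A[i, i] * theta[i]) / A[i, i]
    return theta

def pattern_search_step(theta):
    """Exploratory phase (one cyclic sweep) followed by a line search
    along the combined update direction."""
    explored = cyclic_sweep(theta)
    direction = explored - theta
    # Crude line search: try geometrically growing step lengths and
    # keep the cheapest candidate. s = 1 recovers the plain sweep.
    steps = [2.0 ** k for k in range(6)]
    best = min((theta + s * direction for s in steps), key=cost)
    return best if cost(best) < cost(explored) else explored

theta0 = np.array([1.0, -0.5])
theta_plain, theta_fast = theta0.copy(), theta0.copy()
for _ in range(20):
    theta_plain = cyclic_sweep(theta_plain)
    theta_fast = pattern_search_step(theta_fast)

print("plain cyclic:", cost(theta_plain))
print("pattern search:", cost(theta_fast))
```

On this example the accelerated iterate reaches a far lower cost in the same number of sweeps, because the line search can take extrapolated steps (here up to 32 times the plain update) along the slowly converging direction. The 60–85% savings reported in the abstract refer to the paper's variational Bayesian experiments, not to this toy problem.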
Cite this article
Honkela, A., Valpola, H. & Karhunen, J. Accelerating Cyclic Update Algorithms for Parameter Estimation by Pattern Searches. Neural Processing Letters 17, 191–203 (2003). https://doi.org/10.1023/A:1023655202546