Abstract
While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated a recent proliferation of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While GPs are in general equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP so that it is not degenerate at test time. Training an RRGP consists of learning both the covariance function hyperparameters and the support set. We propose a method for learning the hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative to the SGGP with better generalization capabilities. Finally, we present experiments comparing the different ways of training an RRGP. We provide some Matlab code for learning RRGPs.
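To make the finite-linear-model view concrete, here is a minimal Matlab sketch of reduced rank GP regression (this is not the Matlab code provided with the paper): the predictive mean is a weighted sum of covariance functions centred on an m-element support set, with the weights obtained from an m-by-m linear system. The squared-exponential covariance, the fixed hyperparameters (ell, s2f, s2n) and the equispaced support set are all illustrative assumptions.

% Minimal sketch of reduced rank GP regression as a finite linear model:
% f(x) is approximated by sum_i alpha_i k(x, x_i) over an m-point support set.
n = 200; m = 20;                     % training-set size, support-set size
X = linspace(-5, 5, n)';             % 1-D training inputs (column vector)
y = sin(X) + 0.1*randn(n, 1);        % noisy training targets
ell = 1.0; s2f = 1.0; s2n = 0.01;    % lengthscale, signal and noise variance
idx = round(linspace(1, n, m));      % support set: m equispaced training points
Xm = X(idx);

% Squared-exponential covariance between column vectors of 1-D inputs
k = @(A, B) s2f * exp(-0.5 * (A - B').^2 / ell^2);
Knm = k(X, Xm);                      % n-by-m cross-covariances
Kmm = k(Xm, Xm);                     % m-by-m support-set covariances

% Posterior mean of the m weights of the sparse linear model
M = Knm' * Knm + s2n * Kmm;          % m-by-m system matrix
alpha = M \ (Knm' * y);

xs = linspace(-6, 6, 400)';          % test inputs
mu = k(xs, Xm) * alpha;              % predictive mean: a finite linear model
plot(X, y, '.', xs, mu, '-');

Note that this sketch uses the same finite model at training and test time; the modification discussed in the paper, which prevents the RRGP from being degenerate at test time, is omitted here.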
References
Cressie, N.A.C.: Statistics for Spatial Data. John Wiley and Sons, New Jersey (1993)
Csató, L.: Gaussian Processes – Iterative Sparse Approximation. PhD thesis, Aston University, Birmingham, United Kingdom (2002)
Csató, L., Opper, M.: Sparse online Gaussian processes. Neural Computation 14(3), 641–669 (2002)
Gibbs, M., MacKay, D.J.C.: Efficient implementation of Gaussian processes. Technical report, Cavendish Laboratory, Cambridge University, Cambridge, United Kingdom (1997)
Lawrence, N., Seeger, M., Herbrich, R.: Fast sparse Gaussian process methods: The informative vector machine. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Neural Information Processing Systems, vol. 15, pp. 609–616. MIT Press, Cambridge (2003)
MacKay, D.J.C.: Bayesian non-linear modelling for the energy prediction competition. ASHRAE Transactions 100(2), 1053–1062 (1994)
Neal, R.M.: Bayesian Learning for Neural Networks. Lecture Notes in Statistics, vol. 118. Springer, Heidelberg (1996)
Press, W., Flannery, B., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in C, 2nd edn. Cambridge University Press, Cambridge (1992)
Rasmussen, C.E.: Evaluation of Gaussian Processes and Other Methods for Non-linear Regression. PhD thesis, Department of Computer Science, University of Toronto, Toronto, Ontario (1996)
Rasmussen, C.E.: Reduced rank Gaussian process learning. Unpublished Manuscript (2002)
Schölkopf, B., Smola, A.J.: Learning with Kernels. MIT Press, Cambridge (2002)
Schwaighofer, A., Tresp, V.: Transductive and inductive methods for approximate Gaussian process regression. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems, vol. 15, pp. 953–960. MIT Press, Cambridge (2003)
Seeger, M.: Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations. PhD thesis, University of Edinburgh, Edinburgh, Scotland (2003)
Seeger, M., Williams, C., Lawrence, N.: Fast forward selection to speed up sparse Gaussian process regression. In: Bishop, C.M., Frey, B.J. (eds.) Ninth International Workshop on Artificial Intelligence and Statistics, Society for Artificial Intelligence and Statistics (2003)
Smola, A.J., Bartlett, P.L.: Sparse greedy Gaussian process regression. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 619–625. MIT Press, Cambridge (2001)
Smola, A.J., Schölkopf, B.: Sparse greedy matrix approximation for machine learning. In: Langley, P. (ed.) International Conference on Machine Learning, vol. 17, pp. 911–918. Morgan Kaufmann, San Francisco (2000)
Tipping, M.E.: Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211–244 (2001)
Tresp, V.: A Bayesian committee machine. Neural Computation 12(11), 2719–2741 (2000)
Wahba, G., Lin, X., Gao, F., Xiang, D., Klein, R., Klein, B.: The bias-variance tradeoff and the randomized GACV. In: Kearns, M.S., Solla, S.A., Cohn, D.A. (eds.) Advances in Neural Information Processing Systems, vol. 11, pp. 620–626. MIT Press, Cambridge (1999)
Williams, C.: Computation with infinite neural networks. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Information Processing Systems, vol. 9, pp. 295–301. MIT Press, Cambridge (1997a)
Williams, C.: Prediction with Gaussian processes: From linear regression to linear prediction and beyond. Technical Report NCRG/97/012, Dept of Computer Science and Applied Mathematics, Aston University, Birmingham, United Kingdom (1997b)
Williams, C., Rasmussen, C.E., Schwaighofer, A., Tresp, V.: Observations on the Nyström method for Gaussian process prediction. Technical report, University of Edinburgh, Edinburgh, Scotland (2002)
Williams, C., Seeger, M.: Using the Nyström method to speed up kernel machines. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 682–688. MIT Press, Cambridge (2001)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Quiñonero-Candela, J., Rasmussen, C.E. (2005). Analysis of Some Methods for Reduced Rank Gaussian Process Regression. In: Murray-Smith, R., Shorten, R. (eds) Switching and Learning in Feedback Systems. Lecture Notes in Computer Science, vol 3355. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30560-6_4
DOI: https://doi.org/10.1007/978-3-540-30560-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24457-8
Online ISBN: 978-3-540-30560-6
eBook Packages: Computer Science, Computer Science (R0)