Abstract
Given multiple prediction problems such as regression or classification, we are interested in a joint inference framework that can effectively share information between tasks to improve prediction accuracy, especially when the number of training examples per task is small. In this paper we propose a probabilistic framework that supports a family of latent variable models for different multi-task learning scenarios. We show that the framework generalizes standard learning methods for single prediction problems and effectively models the shared structure among different prediction tasks. Furthermore, we present efficient algorithms for both empirical Bayes estimation and point estimation. Our experiments on simulated datasets and real-world classification datasets demonstrate the effectiveness of the proposed models in two evaluation settings: a standard multi-task learning setting and a transfer learning setting.
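As a rough, self-contained illustration of the information-sharing idea only (a shrinkage-to-shared-prior sketch, not the specific latent variable models proposed in the paper), the Python snippet below fits per-task ridge regressions whose weight vectors are pulled toward a shared prior mean, which is itself re-estimated from all tasks in the spirit of empirical Bayes. All function names and hyperparameter values are illustrative assumptions.

    # Minimal sketch: each task's weights w_t are assumed drawn from a
    # shared Gaussian prior N(mu, sigma2 * I). We alternate between the
    # MAP estimate of each task's weights given mu and re-estimating mu
    # from all tasks. Hypothetical names/values; not the paper's model.
    import numpy as np

    def fit_multitask_ridge(tasks, sigma2=1.0, noise2=0.1, n_iters=10):
        """tasks: list of (X_t, y_t) pairs, X_t of shape (n_t, d)."""
        d = tasks[0][0].shape[1]
        mu = np.zeros(d)                      # shared prior mean
        for _ in range(n_iters):
            Ws = []
            for X, y in tasks:
                # MAP weights under the current prior: solve
                # (X'X/noise2 + I/sigma2) w = X'y/noise2 + mu/sigma2
                A = X.T @ X / noise2 + np.eye(d) / sigma2
                b = X.T @ y / noise2 + mu / sigma2
                Ws.append(np.linalg.solve(A, b))
            mu = np.mean(Ws, axis=0)          # re-estimate shared mean
        return mu, Ws

    # Tiny usage example with simulated related tasks
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=3)
    tasks = []
    for _ in range(5):
        X = rng.normal(size=(8, 3))           # few examples per task
        w_t = w_true + 0.1 * rng.normal(size=3)
        tasks.append((X, X @ w_t + 0.05 * rng.normal(size=8)))
    mu, Ws = fit_multitask_ridge(tasks)

With only a handful of examples per task, the shared mean acts as a data-driven prior, so each task's estimate borrows strength from the others; this is the general effect the paper's framework aims for, realized there through richer latent variable models.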
Editor: Daniel L. Silver, Kristin Bennett, Richard Caruana.
Cite this article
Zhang, J., Ghahramani, Z. & Yang, Y. Flexible latent variable models for multi-task learning. Mach Learn 73, 221–242 (2008). https://doi.org/10.1007/s10994-008-5050-1