
On Learning Vector-Valued Functions

Published: 01 January 2005

Abstract

In this letter, we provide a study of learning in a Hilbert space of vector-valued functions. We motivate the need for extending learning theory of scalar-valued functions by practical considerations and establish some basic results for learning vector-valued functions that should prove useful in applications. Specifically, we allow an output space Y to be a Hilbert space, and we consider a reproducing kernel Hilbert space of functions whose values lie in Y. In this setting, we derive the form of the minimal norm interpolant to a finite set of data and apply it to study some regularization functionals that are important in learning theory. We consider specific examples of such functionals corresponding to multiple-output regularization networks and support vector machines, for both regression and classification. Finally, we provide classes of operator-valued kernels of the dot-product and translation-invariant type.
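
For orientation, here is a minimal sketch in our own notation, not taken verbatim from the letter. Given data \((x_j, y_j)_{j=1}^{m} \subset X \times Y\) and an operator-valued reproducing kernel \(K\), where \(K(x, t)\) is a bounded linear operator on \(Y\) for every \(x, t \in X\), the minimal norm interpolant is known to take the form

\[ f(x) \;=\; \sum_{j=1}^{m} K(x, x_j)\, c_j, \qquad c_j \in Y, \]

with the coefficients determined by the linear system \(\sum_{j=1}^{m} K(x_i, x_j)\, c_j = y_i\) for \(i = 1, \dots, m\). A simple example of an operator-valued kernel is the separable form \(K(x, t) = k(x, t)\, B\), where \(k\) is a scalar reproducing kernel on \(X\) and \(B\) is a positive semidefinite operator on \(Y\); the dot-product and translation-invariant kernels mentioned in the abstract provide richer classes of examples.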

Published In

Neural Computation, Volume 17, Issue 1 (January 2005), 240 pages

Publisher

MIT Press, Cambridge, MA, United States
