
A least squares formulation for a class of generalized eigenvalue problems in machine learning

Published: 14 June 2009

Abstract

Many machine learning algorithms can be formulated as a generalized eigenvalue problem. One major limitation of such a formulation is that the generalized eigenvalue problem is computationally expensive to solve, especially for large-scale problems. In this paper, we show that under a mild condition, a class of generalized eigenvalue problems in machine learning can be formulated as a least squares problem. This class of problems includes classical techniques such as Canonical Correlation Analysis (CCA), Partial Least Squares (PLS), and Linear Discriminant Analysis (LDA), as well as Hypergraph Spectral Learning (HSL). As a result, various regularization techniques can be readily incorporated into the formulation to improve model sparsity and generalization ability. In addition, the least squares formulation leads to efficient and scalable implementations based on iterative conjugate-gradient-type algorithms. We report experimental results that confirm the established equivalence relationship. The results also demonstrate the efficiency and effectiveness of the equivalent least squares formulations on large-scale problems.
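The claimed equivalence can be illustrated on the simplest member of this class, binary-class LDA: the generalized eigenvalue problem S_b w = λ S_t w and a least squares regression of the centered data onto scaled class-indicator targets produce the same projection direction up to scaling. The sketch below is my own illustration of this classical fact using NumPy/SciPy, not code from the paper; the target encoding (n/n1 for one class, -n/n2 for the other) is one standard choice and the synthetic data are arbitrary.

```python
import numpy as np
from scipy.linalg import eigh, lstsq

rng = np.random.default_rng(0)

# Two Gaussian classes in 2-D (arbitrary synthetic data).
n1, n2 = 60, 40
X1 = rng.normal([0.0, 0.0], 1.0, size=(n1, 2))
X2 = rng.normal([3.0, 1.0], 1.0, size=(n2, 2))
X = np.vstack([X1, X2])
n = n1 + n2

m, m1, m2 = X.mean(0), X1.mean(0), X2.mean(0)
Xc = X - m  # centered data

# Between-class and total scatter matrices.
Sb = n1 * np.outer(m1 - m, m1 - m) + n2 * np.outer(m2 - m, m2 - m)
St = Xc.T @ Xc

# (1) Generalized eigenvalue problem: Sb w = lambda * St w.
_, evecs = eigh(Sb, St)          # St is symmetric positive definite here
w_eig = evecs[:, -1]             # eigenvector of the largest eigenvalue

# (2) Least squares regression onto scaled class-indicator targets.
t = np.concatenate([np.full(n1, n / n1), np.full(n2, -n / n2)])
w_ls = lstsq(Xc, t)[0]

# The two directions coincide up to scaling.
cos = abs(w_eig @ w_ls) / (np.linalg.norm(w_eig) * np.linalg.norm(w_ls))
print(f"|cos(angle)| = {cos:.6f}")  # close to 1
```

The practical payoff claimed in the abstract is on the least squares side: for large sparse problems, step (2) can be handled by iterative solvers such as LSQR or conjugate gradient instead of the dense eigendecomposition in step (1), and penalties such as the lasso can be added directly to the regression.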




Published In

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN:9781605585161
DOI:10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

ICML '09
Sponsor: Microsoft Research

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Article Metrics

  • Downloads (Last 12 months)28
  • Downloads (Last 6 weeks)3
Reflects downloads up to 15 Jan 2025


Cited By

  • (2024) On the Equivalence of Linear Discriminant Analysis and Least Squares Regression. IEEE Transactions on Neural Networks and Learning Systems, pp. 1-11. DOI: 10.1109/TNNLS.2022.3208944. Online: 2024.
  • (2024) Dimensionality reduction through clustering of variables and canonical correlation. Journal of the Korean Statistical Society. DOI: 10.1007/s42952-024-00290-3. Online: 26 Sep 2024.
  • (2024) Collaborative and dynamic kernel discriminant analysis for large-scale problems: applications in multi-class learning and novelty detection. Progress in Artificial Intelligence. DOI: 10.1007/s13748-023-00309-6. Online: 22 Jan 2024.
  • (2023) Discriminative Projected Clustering via Unsupervised LDA. IEEE Transactions on Neural Networks and Learning Systems, 34(11), pp. 9466-9480. DOI: 10.1109/TNNLS.2022.3202719. Online: Nov 2023.
  • (2023) A Scalable Unsupervised Feature Selection With Orthogonal Graph Representation for Hyperspectral Images. IEEE Transactions on Geoscience and Remote Sensing, 61, pp. 1-13. DOI: 10.1109/TGRS.2023.3284475. Online: 2023.
  • (2022) fgSpMSpV: A Fine-grained Parallel SpMSpV Framework on HPC Platforms. ACM Transactions on Parallel Computing, 9(2), pp. 1-29. DOI: 10.1145/3512770. Online: 11 Apr 2022.
  • (2022) A Unified Framework for Probabilistic Component Analysis. Machine Learning and Knowledge Discovery in Databases, pp. 469-484. DOI: 10.1007/978-3-662-44851-9_30. Online: 10 Mar 2022.
  • (2022) A Hybrid Synchronization Mechanism for Parallel Sparse Triangular Solve. Languages and Compilers for Parallel Computing, pp. 118-133. DOI: 10.1007/978-3-030-99372-6_8. Online: 24 Mar 2022.
  • (2021) Clustered Discriminant Regression for High-Dimensional Data Feature Extraction and Its Applications in Healthcare and Additive Manufacturing. IEEE Transactions on Automation Science and Engineering, 18(4), pp. 1998-2010. DOI: 10.1109/TASE.2020.3029028. Online: Oct 2021.
  • (2020) Learning Distilled Graph for Large-Scale Social Network Data Clustering. IEEE Transactions on Knowledge and Data Engineering, 32(7), pp. 1393-1404. DOI: 10.1109/TKDE.2019.2904068. Online: 1 Jul 2020.
