
Online learning in the embedded manifold of low-rank matrices

Published: 01 February 2012

Abstract

When learning models that are represented in matrix forms, enforcing a low-rank constraint can dramatically improve the memory and run time complexity, while providing a natural regularization of the model. However, naive approaches to minimizing functions over the set of low-rank matrices are either prohibitively time consuming (repeated singular value decomposition of the matrix) or numerically unstable (optimizing a factored representation of the low-rank matrix). We build on recent advances in optimization over manifolds, and describe an iterative online learning procedure, consisting of a gradient step, followed by a second-order retraction back to the manifold. While the ideal retraction is costly to compute, and so is the projection operator that approximates it, we describe another retraction that can be computed efficiently. It has run time and memory complexity of O((n+m)k) for a rank-k matrix of dimension m×n, when using an online procedure with rank-one gradients. We use this algorithm, LORETA, to learn a matrix-form similarity measure over pairs of documents represented as high dimensional vectors. LORETA improves the mean average precision over a passive-aggressive approach in a factorized model, and also improves over a full model trained on pre-selected features using the same memory requirements. We further adapt LORETA to learn positive semi-definite low-rank matrices, providing an online algorithm for low-rank metric learning. LORETA also shows consistent improvement over standard weakly supervised methods in a large (1600 classes and 1 million images, using ImageNet) multilabel image classification task.
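For intuition, the update scheme described above can be sketched in a few lines of Python/NumPy. The sketch below is a minimal illustration only: it uses the naive retraction (projection onto the rank-k manifold via truncated SVD) that the abstract describes as prohibitively time consuming, not the paper's efficient second-order retraction, and the factored storage W = A B^T is shown mainly to make the O((n+m)k) memory and scoring cost concrete. All function and variable names here are illustrative assumptions, not the paper's notation.

import numpy as np

# Minimal sketch of "gradient step + retraction" on the manifold of rank-k
# matrices. NOTE: the retraction used here is the naive truncated-SVD
# projection, which the abstract notes is too costly in practice; it is NOT
# LORETA's efficient O((n+m)k) retraction.

def similarity(A, B, x, y):
    """Bilinear similarity x^T W y with W = A @ B.T kept in factored form.
    Costs O((n + m) k) per pair instead of O(n m)."""
    return (A.T @ x) @ (B.T @ y)

def naive_online_step(A, B, u, v, step_size, k):
    """One online update with a rank-one gradient u v^T: gradient step in the
    ambient space, then retract (project) back to rank k via truncated SVD."""
    W = A @ B.T - step_size * np.outer(u, v)   # ambient gradient step
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k, :].T       # re-factored best rank-k approximation

# Illustrative usage: an n x m rank-k model updated with one rank-one gradient.
rng = np.random.default_rng(0)
n, m, k = 60, 40, 5
A, B = rng.standard_normal((n, k)), rng.standard_normal((m, k))
u, v = rng.standard_normal(n), rng.standard_normal(m)
A, B = naive_online_step(A, B, u, v, step_size=0.1, k=k)
print(similarity(A, B, rng.standard_normal(n), rng.standard_normal(m)))

Storing only the factors (A, B) rather than the full n×m matrix is what yields the O((n+m)k) memory footprint quoted above; the paper's contribution is a second-order retraction whose run time matches that bound, avoiding the full SVD used in this sketch.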


Information

Published In

The Journal of Machine Learning Research, Volume 13, Issue 1
January 2012
3712 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org

Publication History

Published: 01 February 2012
Published in JMLR Volume 13, Issue 1

Author Tags

  1. Riemannian manifolds
  2. low rank
  3. metric learning
  4. multitask learning
  5. online learning
  6. retractions


Article Metrics

  • Downloads (last 12 months): 169
  • Downloads (last 6 weeks): 23

Reflects downloads up to 03 Jan 2025

Cited By

  • On Geometric Connections of Embedded and Quotient Geometries in Riemannian Fixed-Rank Matrix Optimization. Mathematics of Operations Research, 49(2):782-825, May 2024. DOI: 10.1287/moor.2023.1377
  • Implicit Low-Rank Riemannian Schemes for the Time Integration of Stiff Partial Differential Equations. Journal of Scientific Computing, 101(1), August 2024. DOI: 10.1007/s10915-024-02629-8
  • Multimodal-aware weakly supervised metric learning with self-weighting triplet loss. Multimedia Tools and Applications, 81(28):41151-41173, November 2022. DOI: 10.1007/s11042-022-12053-5
  • Nonconvex Weak Sharp Minima on Riemannian Manifolds. Journal of Optimization Theory and Applications, 183(1):85-104, October 2019. DOI: 10.1007/s10957-019-01539-2
  • Inexact trust-region algorithms on Riemannian manifolds. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 4254-4265, December 2018. DOI: 10.5555/3327144.3327338
  • Robust PCA by manifold optimization. The Journal of Machine Learning Research, 19(1):3101-3139, January 2018. DOI: 10.5555/3291125.3309642
  • Structured Learning of Binary Codes with Column Generation for Optimizing Ranking Measures. International Journal of Computer Vision, 123(2):287-308, June 2017. DOI: 10.1007/s11263-016-0984-4
  • Low-Rank Modeling and Its Applications in Image Analysis. ACM Computing Surveys, 47(2):1-33, December 2014. DOI: 10.1145/2674559
