
Online learning in the embedded manifold of low-rank matrices

Published: 01 February 2012

Abstract

When learning models that are represented in matrix forms, enforcing a low-rank constraint can dramatically improve the memory and run time complexity, while providing a natural regularization of the model. However, naive approaches to minimizing functions over the set of low-rank matrices are either prohibitively time consuming (repeated singular value decomposition of the matrix) or numerically unstable (optimizing a factored representation of the low-rank matrix). We build on recent advances in optimization over manifolds, and describe an iterative online learning procedure, consisting of a gradient step, followed by a second-order retraction back to the manifold. While the ideal retraction is costly to compute, and so is the projection operator that approximates it, we describe another retraction that can be computed efficiently. It has run time and memory complexity of O((n+m)k) for a rank-k matrix of dimension m×n, when using an online procedure with rank-one gradients. We use this algorithm, LORETA, to learn a matrix-form similarity measure over pairs of documents represented as high dimensional vectors. LORETA improves the mean average precision over a passive-aggressive approach in a factorized model, and also improves over a full model trained on pre-selected features using the same memory requirements. We further adapt LORETA to learn positive semi-definite low-rank matrices, providing an online algorithm for low-rank metric learning. LORETA also shows consistent improvement over standard weakly supervised methods in a large (1600 classes and 1 million images, using ImageNet) multilabel image classification task.
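For intuition, the update scheme described above can be sketched in a few lines of Python/NumPy. The sketch below is a minimal illustration only: it uses the naive retraction (projection onto the rank-k manifold via truncated SVD) that the abstract describes as prohibitively time consuming, not the paper's efficient second-order retraction, and the factored storage W = A B^T is shown mainly to make the O((n+m)k) memory and scoring cost concrete. All function and variable names here are illustrative assumptions, not the paper's notation.

import numpy as np

# Minimal sketch of "gradient step + retraction" on the manifold of rank-k
# matrices. NOTE: the retraction used here is the naive truncated-SVD
# projection, which the abstract notes is too costly in practice; it is NOT
# LORETA's efficient O((n+m)k) retraction.

def similarity(A, B, x, y):
    """Bilinear similarity x^T W y with W = A @ B.T kept in factored form.
    Costs O((n + m) k) per pair instead of O(n m)."""
    return (A.T @ x) @ (B.T @ y)

def naive_online_step(A, B, u, v, step_size, k):
    """One online update with a rank-one gradient u v^T: gradient step in the
    ambient space, then retract (project) back to rank k via truncated SVD."""
    W = A @ B.T - step_size * np.outer(u, v)   # ambient gradient step
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k, :].T       # re-factored best rank-k approximation

# Illustrative usage: an n x m rank-k model updated with one rank-one gradient.
rng = np.random.default_rng(0)
n, m, k = 60, 40, 5
A, B = rng.standard_normal((n, k)), rng.standard_normal((m, k))
u, v = rng.standard_normal(n), rng.standard_normal(m)
A, B = naive_online_step(A, B, u, v, step_size=0.1, k=k)
print(similarity(A, B, rng.standard_normal(n), rng.standard_normal(m)))

Storing only the factors (A, B) rather than the full n×m matrix is what yields the O((n+m)k) memory footprint quoted above; the paper's contribution is a second-order retraction whose run time matches that bound, avoiding the full SVD used in this sketch.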


Information

Published In

The Journal of Machine Learning Research, Volume 13, Issue 1
January 2012
3712 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org

Publication History

Published: 01 February 2012
Published in JMLR Volume 13, Issue 1

Author Tags

  1. Riemannian manifolds
  2. low rank
  3. metric learning
  4. multitask learning
  5. online learning
  6. retractions


Article Metrics

  • Downloads (last 12 months): 169
  • Downloads (last 6 weeks): 23

Reflects downloads up to 03 Jan 2025

Cited By

  • On Geometric Connections of Embedded and Quotient Geometries in Riemannian Fixed-Rank Matrix Optimization. Mathematics of Operations Research, 49(2):782-825, May 2024. DOI: 10.1287/moor.2023.1377
  • Implicit Low-Rank Riemannian Schemes for the Time Integration of Stiff Partial Differential Equations. Journal of Scientific Computing, 101(1), August 2024. DOI: 10.1007/s10915-024-02629-8
  • Multimodal-aware weakly supervised metric learning with self-weighting triplet loss. Multimedia Tools and Applications, 81(28):41151-41173, November 2022. DOI: 10.1007/s11042-022-12053-5
  • Nonconvex Weak Sharp Minima on Riemannian Manifolds. Journal of Optimization Theory and Applications, 183(1):85-104, October 2019. DOI: 10.1007/s10957-019-01539-2
  • Inexact trust-region algorithms on Riemannian manifolds. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 4254-4265, December 2018. DOI: 10.5555/3327144.3327338
  • Robust PCA by manifold optimization. The Journal of Machine Learning Research, 19(1):3101-3139, January 2018. DOI: 10.5555/3291125.3309642
  • Structured Learning of Binary Codes with Column Generation for Optimizing Ranking Measures. International Journal of Computer Vision, 123(2):287-308, June 2017. DOI: 10.1007/s11263-016-0984-4
  • Low-Rank Modeling and Its Applications in Image Analysis. ACM Computing Surveys, 47(2):1-33, December 2014. DOI: 10.1145/2674559
