[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Sublinear optimization for machine learning

Published: 05 November 2012 Publication History

Abstract

In this article we describe and analyze sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination of a novel sampling techniques and a new multiplicative update algorithm. We give lower bounds which show the running times of many of our algorithms to be nearly best possible in the unit-cost RAM model.

References

[1]
Blum, A., Frieze, A. M., Kannan, R., and Vempala, S. 1998. A polynomial-time algorithm for learning noisy linear threshold functions. Algorithmica 22, 1/2, 35--52.
[2]
Bylander, T. 1994. Learning linear threshold functions in the presence of classification noise. In Proceedings of the 7th Annual Conference on Computational Learning Theory (COLT'94). ACM, New York, 340--347.
[3]
Cesa-Bianchi, N., Conconi, A., and Gentile, C. 2004. On the generalization ability of on-line learning algorithms. IEEE Trans. Inf. Theory 50, 9, 2050--2057.
[4]
Cesa-Bianchi, N., Shalev-Shwartz, S., and Shamir, O. 2010. Online learning of noisy data with kernels. In Proceedings of the Annual Conference on Learning Theory (COLT), A. T. Kalai and M. Mohri, Eds. Omnipress, 218--230.
[5]
Clarkson, K. L. 2008. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. In Proceedings of the Proc. 19th ACM-SIAM Symposium on Discrete Algorithms. SIAM, Philadelphia, PA, 922--931.
[6]
Cover, T. and Thomas, J. A. 1991. Elements of Information Theory. Wiley Series in Telecommunications.
[7]
Diaconis, P. and Freedman, D. 1980. Finite exchangeable sequences. Ann. Probab. 8, 745--764.
[8]
Dunagan, J. and Vempala, S. 2004. A simple polynomial-time rescaling algorithm for solving linear programs. In Proceedings of the 36th Annual ACM Symposium on the Theory of Computing. ACM, New York, 315--320.
[9]
Frank, M. and Wolfe, P. 1956. An algorithm for quadratic programming. Naval Res. Logis. Quart. 3, 95--110.
[10]
Grigoriadis, M. D. and Khachiyan, L. G. 1995. A sublinear-time randomized approximation algorithm for matrix games. Oper. Res. Lett. 18, 53--58.
[11]
Hazan, E. 2011. The convex optimization approach to regret minimization. Optimiz. Mach. Learn. 1.
[12]
Hazan, E., Agarwal, A., and Kale, S. 2007. Logarithmic regret algorithms for online convex optimization. Mach. Lear. 69, 2-3, 169--192.
[13]
Koufogiannakis, C. and Young, N. E. 2007. Beating simplex for fractional packing and covering linear programs. In Proceedings of the Conference on Foundations of Computer Science (FOCS). IEEE Computer Society, 494--504.
[14]
Minsky, M. and Papert, S. 1988. Perceptrons: An Introduction to Computational Geometry. MIT Press Cambridge, MA.
[15]
Monemizadeh, M. and Woodruff, D. 2010. 1-pass relative error lp-sampling with applications. In Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms.
[16]
Motwani, R. and Raghavan, P. 1995. Randomized Algorithms. Cambridge University Press.
[17]
Novikoff, A. B. 1963. On convergence proofs for perceptrons. In Proceedings of the Symposium on the Mathematical Theory of Automata. Vol. 12., 615--622.
[18]
Panconesi, A. and Srinivasan, A. 1997. Randomized distributed edge coloring via an extension of the chernoff-hoeffding bounds. SIAM J. Comput. 26, 2, 350--368.
[19]
Plotkin, S. A., Shmoys, D. B., and Tardos, E. 1991. Fast approximation algorithms for fractional packing and covering problems. In Proceedings of the 32nd Annual Symposium on Foundations of Computer Science. IEEE Computer Society, Los Alamitos, CA, 495--504.
[20]
Saha, A. and Vishwanathan, S. 2009. Efficient approximation algorithms for minimum enclosing convex shapes. arXiv:0909.1062v2.
[21]
Schölkopf, B. and Smola, A. J. 2003. A Short Introduction to Learning with Kernels. Springer, New York.
[22]
Servedio, R. A. 1999. On PAC learning using winnow, perceptron, and a perceptron-like algorithm. In Proceedings of the 12th Annual Conference on Computational Learning Theory. ACM, New York, 296--307.
[23]
Zinkevich, M. 2003. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning (ICML). 928--936.

Cited By

View all
  • (2024)Enhancing Forecasting Accuracy in Commodity and Financial Markets: Insights from GARCH and SVR ModelsInternational Journal of Financial Studies10.3390/ijfs1203005912:3(59)Online publication date: 26-Jun-2024
  • (2024)Stochastic Gradient SchemesStochastic Approximation: A Dynamical Systems Viewpoint10.1007/978-981-99-8277-6_11(187-221)Online publication date: 1-Feb-2024
  • (2023)On generalization bounds for projective clusteringProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669262(71723-71754)Online publication date: 10-Dec-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Journal of the ACM
Journal of the ACM  Volume 59, Issue 5
October 2012
204 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/2371656
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 November 2012
Accepted: 01 May 2012
Revised: 01 December 2011
Received: 01 April 2011
Published in JACM Volume 59, Issue 5

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)27
  • Downloads (Last 6 weeks)3
Reflects downloads up to 28 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Enhancing Forecasting Accuracy in Commodity and Financial Markets: Insights from GARCH and SVR ModelsInternational Journal of Financial Studies10.3390/ijfs1203005912:3(59)Online publication date: 26-Jun-2024
  • (2024)Stochastic Gradient SchemesStochastic Approximation: A Dynamical Systems Viewpoint10.1007/978-981-99-8277-6_11(187-221)Online publication date: 1-Feb-2024
  • (2023)On generalization bounds for projective clusteringProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669262(71723-71754)Online publication date: 10-Dec-2023
  • (2023)Rank-based Decomposable Losses in Machine Learning: A SurveyIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2023.3296062(1-20)Online publication date: 2023
  • (2023)A smoothing proximal gradient algorithm with extrapolation for the relaxation of regularization problemComputational Optimization and Applications10.1007/s10589-022-00446-z84:3(737-760)Online publication date: 11-Jan-2023
  • (2022)A differentially private linear-time fPTAS for the minimum enclosing ball problemProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3602564(31640-31652)Online publication date: 28-Nov-2022
  • (2022)Distributionally Robust Optimization: A review on theory and applicationsNumerical Algebra, Control & Optimization10.3934/naco.202105712:1(159)Online publication date: 2022
  • (2022)Quantum Differentially Private Sparse Regression LearningIEEE Transactions on Information Theory10.1109/TIT.2022.316472668:8(5217-5233)Online publication date: 1-Aug-2022
  • (2022)SVR and ARIMA models as machine learning solutions for solving the latency problem in real-time clock correctionsGPS Solutions10.1007/s10291-022-01270-y26:3Online publication date: 1-Jul-2022
  • (2022)Stochastic Gradient SchemesStochastic Approximation: A Dynamical Systems Viewpoint10.1007/978-81-951961-1-1_11(191-225)Online publication date: 2022
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media