More Web Proxy on the site http://driver.im/

research-article

Online strongly convex optimization with unknown delays

Authors:

Lijun ZhangAuthors Info & Claims

Machine Learning, Volume 111, Issue 3

Pages 871 - 893

https://doi.org/10.1007/s10994-021-06072-w

Published: 01 March 2022 Publication History

Abstract

We investigate the problem of online convex optimization with unknown delays, in which the feedback of a decision arrives with an arbitrary delay. Previous studies have presented delayed online gradient descent (DOGD), and achieved the regret bound of

O (\sqrt{D})

by only utilizing the convexity condition, where

D \geq T

is the sum of delays over T rounds. In this paper, we further exploit the strong convexity to improve the regret bound. Specifically, we first propose a variant of DOGD for strongly convex functions, and establish a better regret bound of

O (d log T)

, where d is the maximum delay. The essential idea is to let the learning rate decay with the total number of received feedback linearly. Furthermore, we extend the strongly convex variant of DOGD and its theoretical guarantee to the more challenging bandit setting by combining with the classical

(n + 1)

-point and two-point gradient estimators, where n is the dimensionality. To the best of our knowledge, this is the first work that solves online strongly convex optimization under the general delayed setting.

References

[1]

Abernethy, J. D., Bartlett, P. L., Rakhlin, A., & Tewari, A. (2008). Optimal stragies and minimax lower bounds for online convex games. In Proceedings of the 21st annual conference on learning theory (pp. 415–424).

[2]

Agarwal, A., Hazan, E., Kale, S., & Schapire, R. E. (2006). Algorithms for portfolio management based on the Newton method. In Proceedings of the 23rd international conference on machine learning (pp. 9–16).

[3]

Agarwal, A., Dekel, O., & Xiao, L. (2010). Optimal algorithms for online convex optimization with multi-point bandit feedback. In Proceedings of the 23rd annual conference on learning theory (pp. 28–40).

[4]

Blum A and Kalai A Universal portfolios with and without transaction costs Machine Learning 1999 35 3 193-205

[5]

Cesa-Bianchi, N., & Lugosi, G. (2006). Prediction, learning, and games. Cambridge University Press.

[6]

Duchi J, Hazan E, and Singer Y Adaptive subgradient methods for online learning and stochastic optimization Journal of Machine Learning Research 2011 12 2121-2159

[7]

Flaxman, A. D., Kalai, A. T., & McMahan, H. B. (2005). Online convex optimization in the bandit setting: Gradient descent without a gradient. In Proceedings of the 16th annual ACM-SIAM symposium on discrete algorithms (pp. 385–394).

[8]

Gaillard, P., Stoltz, G., & van Erven, T. (2014). A second-order bound with excess losses. In Proceedings of the 27th annual conference on learning theory (pp. 176–196).

[9]

Hazan E Introduction to online convex optimization Foundations and Trends in Optimization 2016 2 3–4 157-325

[10]

Hazan, E., & Kale, S. (2012). Projection-free online learning. In Proceedings of the 29th international conference on machine learning (pp. 1843–1850).

[11]

Hazan E, Agarwal A, and Kale S Logarithmic regret algorithms for online convex optimization Machine Learning 2007 69 2 169-192

[12]

He, X., Pan, J., Jin, O., Xu, T., Liu, B., Xu, T., Shi, Y., Atallah, A., Herbrich, R., Bowers, S., & Candela, J. Q. (2014). Practical lessons from predicting clicks on ads at facebook. In Proceedings of the 8th international workshop on data mining for online advertising (pp. 1–9).

[13]

Héliou, A., Mertikopoulos, P., & Zhou, Z. (2020). Gradient-free online learning in games with delayed rewards. In Proceedings of the 37th international conference on machine learning (pp. 4172–4181).

[14]

Joulani, P., György, A., & Szepesvári, C. (2013). Online learning under delayed feedback. In Proceedings of the 30th international conference on machine learning (pp. 1453–1461).

[15]

Joulani, P., György, A., & Szepesvári, C. (2016). Delay-tolerant online convex optimization: Unified analysis and adaptive-gradient algorithms. In Proceedings of the 30th AAAI conference on artificial Intelligence (pp. 1744–1750).

[16]

Khashabi, D., Quanrud, K., & Taghvaei, A. (2016). Adversarial delays in online strongly-convex optimization.arXiv:160506201v1.

[17]

Langford J, Smola AJ, and Zinkevich M Slow learners are fast Advances in Neural Information Processing Systems 2009 22 2331-2339

[18]

Li, B., Chen, T., & Giannakis, G. B. (2019). Bandit online learning with unknown delays. In Proceedings of the 22nd international conference on artificial Intelligence and statistics (pp. 993–1002).

[19]

McMahan, H. B., & Streeter, M. (2010). Adaptive bound optimization for online convex optimization. In Proceedings of the 23rd conference on learning theory (pp. 244–256).

[20]

McMahan HB and Streeter M Delay-tolerant algorithms for asynchronous distributed online learning Advances in Neural Information Processing Systems 2014 27 2915-2923

[21]

McMahan, H. B., Holt, G., Sculley, D., Young, M., Ebner, D., Grady, J., Nie, L., Phillips, T., Davydov, E., Golovin, D., Chikkerur, S., Liu, D., Wattenberg, M., Hrafnkelsson, A. M., Boulos, T., & Kubica, J. (2013). Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1222–1230).

[22]

Mesterharm, C. (2005). On-line learning with delayed label feedback. In Proceedings of the 16th international conference on algorithmic learning theory (pp. 399–413).

[23]

Quanrud K and Khashabi D Online learning with adversarial delays Advances in Neural Information Processing Systems 2015 28 1270-1278

[24]

Saha, A., & Tewari, A. (2011). Improved regret guarantees for online smooth convex optimization with bandit feedback. In Proceedings of the 14th international conference on artificial intelligence and statistics (pp. 636–642).

[25]

Shalev-Shwartz S Online learning and online convex optimization Foundations and Trends in Machine Learning 2011 4 2 107-194

[26]

Shalev-Shwartz, S., Singer, Y., & Srebro, N. (2007). Pegasos: Primal estimated subgradient solver for SVM. In Proceedings of the 24th international conference on machine learning (pp. 807–814).

[27]

Shamir, O., & Szlak, L. (2017). Online learning with local permutations and delayed feedback. In Proceedings of the 34th international conference on machine learning (pp. 3086–3094).

[28]

Wan, Y., Tu, W. W., & Zhang, L. (2020). Projection-free distributed online convex optimization with

O (\sqrt{T})

communication complexity. In Proceedings of the 37th international conference on machine learning (pp. 9818–9828).

[29]

Wan, Y., Tu, W. W., & Zhang, L. (2021a). Strongly adaptive online learning over partial intervals. Science China Information Sciences.

[30]

Wan, Y., Wang, G., & Zhang, L. (2021b). Projection-free distributed online learning with strongly convex losses.arXiv:210311102

[31]

Wang, G., Lu, S., Cheng, Q., Tu, W. W., & Zhang, L. (2020). Sadam: A variant of adam for strongly convex functions. In International conference on learning representations (pp. 1–21).

[32]

Weinberger MJ and Ordentlich E On delayed prediction of individual sequences IEEE Transactions on Information Theory 2002 48 7 1959-1976

[33]

Zhang L, Lu S, and Zhou ZH Adaptive online learning in dynamic environments Advances in Neural Information Processing Systems 2018 31 1323-1333

[34]

Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th international conference on machine learning (pp. 928–936).

Cited By

Wan YYao CSong MZhang LSalakhutdinov RKolter ZHeller KWeller AOliver NScarlett JBerkenkamp F(2024)Non-stationary online convex optimization with arbitrary delaysProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3694115(49991-50011)Online publication date: 21-Jul-2024
https://dl.acm.org/doi/10.5555/3692070.3694115
Qian YZhang ZZhao PZhou Z(2024)Learning with Asynchronous LabelsACM Transactions on Knowledge Discovery from Data10.1145/366218618:8(1-27)Online publication date: 3-May-2024
https://dl.acm.org/doi/10.1145/3662186
Wu PHuang HLiu ZChua TNgo CKa-Wei Lee RKumar RLauw H(2024)Online Sequential Decision-Making with Unknown DelaysProceedings of the ACM Web Conference 202410.1145/3589334.3645388(4028-4036)Online publication date: 13-May-2024
https://dl.acm.org/doi/10.1145/3589334.3645388
Show More Cited By

Recommendations

Regret bounded by gradual variation for online convex optimization

Recently, it has been shown that the regret of the Follow the Regularized Leader (FTRL) algorithm for online linear optimization can be bounded by the total variation of the cost vectors rather than the number of rounds. In this paper, we extend this ...
Regret in Online Combinatorial Optimization

We address online linear optimization problems when the possible actions of the decision maker are represented by binary vectors. The regret of the decision maker is the difference between her realized loss and the minimal loss she would have achieved ...
Logarithmic regret algorithms for online convex optimization

In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Machine Language

Machine Language Volume 111, Issue 3

Mar 2022

365 pages

ISSN:0885-6125

Issue’s Table of Contents

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2021.

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 March 2022

Accepted: 22 September 2021

Revision received: 24 July 2021

Received: 10 May 2021

Author Tags

Qualifiers

Research-article

Funding Sources

national natural science foundation of china
jiangsu science foundation

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

4
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 21 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Wan YYao CSong MZhang LSalakhutdinov RKolter ZHeller KWeller AOliver NScarlett JBerkenkamp F(2024)Non-stationary online convex optimization with arbitrary delaysProceedings of the 41st International Conference on Machine Learning10.5555/3692070.3694115(49991-50011)Online publication date: 21-Jul-2024
https://dl.acm.org/doi/10.5555/3692070.3694115
Qian YZhang ZZhao PZhou Z(2024)Learning with Asynchronous LabelsACM Transactions on Knowledge Discovery from Data10.1145/366218618:8(1-27)Online publication date: 3-May-2024
https://dl.acm.org/doi/10.1145/3662186
Wu PHuang HLiu ZChua TNgo CKa-Wei Lee RKumar RLauw H(2024)Online Sequential Decision-Making with Unknown DelaysProceedings of the ACM Web Conference 202410.1145/3589334.3645388(4028-4036)Online publication date: 13-May-2024
https://dl.acm.org/doi/10.1145/3589334.3645388
Wan YTu WZhang LKoyejo SMohamed SAgarwal ABelgrave DCho KOh A(2022)Online frank-wolfe with arbitrary delaysProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601702(19703-19715)Online publication date: 28-Nov-2022
https://dl.acm.org/doi/10.5555/3600270.3601702

View Options

View options

Media

Figures

Other

Tables

View Issue’s Table of Contents