research-article

Online learning: A comprehensive survey

Authors:

Steven C.H. Hoi,

Peilin Zhao

Volume 459, Issue C

Pages 249 - 289

https://doi.org/10.1016/j.neucom.2021.04.112

Published: 12 October 2021

Abstract

Online learning represents a family of machine learning methods in which a learner tackles a predictive (or, more generally, decision-making) task by learning from a sequence of data instances one at a time. The goal of online learning is to maximize the accuracy or correctness of the sequence of predictions or decisions made by the online learner, given knowledge of the correct answers to previous prediction or learning tasks and possibly additional information. This is in contrast to traditional batch or offline machine learning methods, which are often designed to learn a model from the entire training data set at once. Online learning has become a promising technique for learning from continuous streams of data in many real-world applications. This article aims to provide a comprehensive survey of the online machine learning literature through a systematic review of basic ideas and key principles and a proper categorization of different algorithms and techniques. Generally speaking, according to the types of learning tasks and the forms of feedback information, existing online learning work can be classified into three major categories: (i) online supervised learning, where full feedback information is always available; (ii) online learning with limited feedback; and (iii) online unsupervised learning, where no feedback is available. Owing to space limitations, the survey focuses mainly on the first category, but also briefly covers some basics of the other two categories. Finally, we discuss some open issues and attempt to shed light on potential future research directions in this field.
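
To make the setting in the abstract concrete, the following is a minimal sketch of the online supervised learning protocol with full feedback: the learner sees one instance, predicts, receives the correct answer, and updates before the next instance arrives. The classic perceptron update is used here purely as an illustrative choice, not as the survey's own method; the function name `run_online_perceptron` and the toy data stream are assumptions made for this example.

```python
from typing import Iterable, List, Sequence, Tuple


def run_online_perceptron(
    stream: Iterable[Tuple[Sequence[float], int]], dim: int
) -> Tuple[List[float], int]:
    """Process (x, y) pairs one at a time; labels y are +1 or -1.

    Hypothetical helper for illustration only. Returns the final weight
    vector and the number of prediction mistakes made along the way.
    """
    w = [0.0] * dim  # the model starts with no knowledge
    mistakes = 0
    for x, y in stream:
        # 1. Predict using the current model, before the label is revealed.
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score >= 0 else -1
        # 2. Receive full feedback (the true label y) and compare.
        if y_hat != y:
            mistakes += 1
            # 3. Update only on a mistake (perceptron rule: w <- w + y * x).
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


# Toy stream of (features, label) pairs, invented for this sketch.
toy_stream = [
    ([1.0, 0.0], 1),
    ([0.0, 1.0], -1),
    ([1.0, 1.0], 1),
    ([0.0, 2.0], -1),
]
weights, mistakes = run_online_perceptron(toy_stream, dim=2)
```

The mistake count returned here is the quantity an online learner tries to keep small over the whole sequence, in contrast to a batch learner, which would fit a model to all four examples at once before making any prediction.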

References

[1]

Y. Abbasi-Yadkori, D. Pál, C. Szepesvári, Improved algorithms for linear stochastic bandits, in: Advances in Neural Information Processing Systems, 2011, pp. 2312–2320.

[2]

J. Abernethy, K. Canini, J. Langford, A. Simma, Online collaborative filtering, (Tech. Rep.) University of California at Berkeley, 2007.

[3]

J. Abernethy, E. Hazan, A. Rakhlin, Competing in the dark: An efficient algorithm for bandit linear optimization, in: COLT, 2008, pp. 263–274.

[4]

M.R. Ackermann, M. Märtens, C. Raupach, K. Swierkot, C. Lammersen, C. Sohler, Streamkm++: a clustering algorithm for data streams, J. Exp. Algorithmics 17 (2012) 2–4.

[5]

A. Agarwal, E. Hazan, S. Kale, R.E. Schapire, Algorithms for portfolio management based on the newton method, ICML, ACM (2006) 9–16.

[6]

A. Agarwal, M.J. Wainwright, J.C. Duchi, Distributed dual averaging in networks, Advances in Neural Information Processing Systems (2010) 550–558.

[7]

R. Agarwal, A.A. Sekh, K. Agarwal, D.K. Prasad, Auxiliary network: scalable and agile online learning for dynamic system with inconsistently available inputs, 2020, arXiv preprint arXiv:2008.11828.

[8]

C.C. Aggarwal, A survey of stream clustering algorithms, 2013.

[9]

C.C. Aggarwal, J. Han, J. Wang, P.S. Yu, A framework for projected clustering of high dimensional data streams, in: VLDB, 2004.

[10]

S. Agmon, The relaxation method for linear inequalities, Can. J. Math. 6 (1954) 382–392.

[11]

S. Agrawal, N. Goyal, Analysis of thompson sampling for the multi-armed bandit problem, Conference on Learning Theory (2012) 31–39.

[12]

S. Agrawal, N. Goyal, Thompson sampling for contextual bandits with linear payoffs, International Conference on Machine Learning (2013) 127–135.

[13]

K. Akcoglu, P. Drineas, M.Y. Kao, Fast universalization of investment strategies, SIAM J. Comput. 34 (2004) 1–22.

[14]

S. Albers, Online algorithms: a survey, Math. Program. (2003).

[15]

M. Ali, C.C. Johnson, A.K. Tang, Parallel collaborative filtering for streaming data, (Tech. Rep.) University of Texas Austin, 2011.

[16]

A. Amini, T.Y. Wah, H. Saboohi, On density-based data streams clustering algorithms: a survey, J. Comput. Sci. Technol. 29 (2014) 116–141.

[17]

A. Amini, W. Ying, Dengris-stream: a density-grid based clustering algorithm for evolving data streams over sliding window, in: Proc. International Conference on Data Mining and Computer Engineering, 2012, pp. 206–210.

[18]

O. Anava, E. Hazan, S. Mannor, O. Shamir, Online learning for time series prediction, Conference on Learning Theory (2013) 172–184.

[19]

F. Angiulli, F. Fassetti, Detecting distance-based outliers in streams of data, in: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, ACM, 2007, pp. 811–820.

[20]

K. Ariu, N. Ryu, S.Y. Yun, A. Proutière, Regret in online recommendation systems, in: Advances in Neural Information Processing Systems 33, 2020.

[21]

R. Arora, A. Cotter, K. Livescu, N. Srebro, Stochastic optimization for pca and pls, in: Allerton Conference, Citeseer, 2012a, pp. 861–868.

[22]

R. Arora, A. Cotter, N. Srebro, Stochastic optimization of pca with capped msg, Advances in Neural Information Processing Systems (2013) 1815–1823.

[23]

S. Arora, E. Hazan, S. Kale, The multiplicative weights update method: a meta-algorithm and applications, Theory Comput. 8 (2012) 121–164.

[24]

A. Ashfahani, M. Pratama, Autonomous deep learning: continual learning approach for dynamic environments, in: Proceedings of the 2019 SIAM International Conference on Data Mining, 2019, pp. 666–674.

[25]

A. Ashfahani, M. Pratama, E. Lughofer, Y.S. Ong, Devdan: deep evolving denoising autoencoder, Neurocomputing 390 (2020) 297–314.

[26]

L.E. Atlas, D.A. Cohn, R.E. Ladner, Training connectionist networks with queries and selective sampling, in: D.S. Touretzky (Ed.), Advances in Neural Information Processing Systems 2, Morgan-Kaufmann, 1990, pp. 566–573.

[27]

J.Y. Audibert, R. Munos, C. Szepesvári, Exploration–exploitation tradeoff using variance estimates in multi-armed bandits, Theoret. Comput. Sci. 410 (2009) 1876–1902.

[28]

P. Auer, Using confidence bounds for exploitation-exploration trade-offs, J. Mach. Learn. Res. 3 (2002) 397–422.

[29]

P. Auer, N. Cesa-Bianchi, P. Fischer, Finite-time analysis of the multiarmed bandit problem, Mach. Learn. 47 (2002) 235–256.

[30]

P. Auer, N. Cesa-Bianchi, Y. Freund, R.E. Schapire, Gambling in a rigged casino: the adversarial multi-armed bandit problem, in: Focs, IEEE, 1995, p. 322.

[31]

P. Auer, N. Cesa-Bianchi, Y. Freund, R.E. Schapire, The nonstochastic multiarmed bandit problem, SIAM J. Comput. 32 (2002) 48–77.

Digital Library

[32]

G. BakIr, Predicting structured data, MIT press, 2007.

[33]

Y. Baram, R.E. Yaniv, K. Luz, Online choice of active learning algorithms, J. Mach. Learn. Res. 5 (2004) 255–291.

[34]

B. Barbaro, Tuning hyperparameters for online learning. Ph.D. thesis. Case Western Reserve University, 2018.

[35]

A.G. Barto, T.G. Dietterich, Reinforcement learning and its relationship to supervised learning, Handbook of learning and approximate dynamic programming 2 (2004) 47.

[36]

M. Belkin, P. Niyogi, V. Sindhwani, Manifold regularization: a geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res. 7 (2006) 2399–2434.

[37]

S. Ben-David, E. Kushilevitz, Y. Mansour, Online learning versus offline learning, Mach. Learn. 29 (1997) 45–63.

[38]

P. Berkhin, A survey of clustering data mining techniques, Grouping multidimensional data. Springer (2006) 25–71.

[39]

D.A. Berry, R.W. Chen, A. Zame, D.C. Heath, L.A. Shepp, Bandit problems with infinitely many arms, Ann. Stat. (1997) 2103–2116.

[40]

A. Beygelzimer, F. Orabona, C. Zhang, Efficient online bandit multiclass learning with Tregret, in: International Conference on Machine Learning, 2017.

[41]

V. Bhatnagar, S. Kaur, S. Chakravarthy, Clustering data streams using grid-based synopsis, Knowl. Inf. Syst. 41 (2014) 127–152.

[42]

H. Bhatt, R. Singh, M. Vatsa, N. Ratha, Improving cross-resolution face matching using ensemble based co-transfer learning, 2014.

[43]

H.S. Bhatt, R. Singh, M. Vatsa, N. Ratha, Matching cross-resolution face images using co-transfer learning, in: Image Processing (ICIP), 2012 19th IEEE International Conference on IEEE, 2012, pp. 1453–1456.

[44]

M. Biesialska, K. Biesialska, M.R. Costa-jussà, Continual lifelong learning in natural language processing: A survey, in: Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 6523–6541.

[45]

A. Blum, On-line algorithms in machine learning, Springer, 1998.

[46]

A.P. Boedihardjo, C.T. Lu, F. Chen, A framework for estimating complex probability density structures in data streams, in: Proceedings of the 17th ACM conference on Information and knowledge management ACM, 2008, pp. 619–628.

[47]

A. Borodin, R. El-Yaniv, V. Gogan, Can we learn to beat the best stock, Advances in Neural Information Processing Systems (2004) 345–352.

[48]

L. Bottou, Online algorithms and stochastic approximations, in: D. Saad (Ed.), Online Learning and Neural Networks, Cambridge University Press, Cambridge, UK. Revised, Oct 2012, 1998a.

[49]

L. Bottou, Online learning and stochastic approximations, On-line learning in neural networks 17 (1998) 142.

[50]

L. Bottou, Stochastic learning, Advanced lectures on machine learning. Springer (2004) 146–168.

[51]

L. Bottou, Large-scale machine learning with stochastic gradient descent, in: Proceedings of COMPSTAT’2010, Springer, 2010, pp. 177–186.

[52]

O. Bousquet, L. Bottou, The tradeoffs of large scale learning, Advances in neural information processing systems (2008) 161–168.

[53]

S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn. 3 (2011) 1–122.

Digital Library

[54]

Y. Bu, L. Chen, A.W.C. Fu, D. Liu, Efficient anomaly monitoring over moving object trajectory streams, in: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2009, pp. 159–168.

[55]

S. Bubeck, N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Found. Trends Mach. Learn. 5 (2012) 1–122.

[56]

S. Bubeck, N. Cesa-Bianchi, S.M. Kakade, et al., Towards minimax policies for online linear optimization with bandit feedback, in: COLT, 2012.

[57]

S. Bubeck, R. Munos, G. Stoltz, C. Szepesvári, X-armed bandits, J. Mach. Learn. Res. 12 (2011) 1655–1695.

[58]

C.J. Burges, et al., Dimension reduction: a guided tour, Mach. Learn. 2 (2010) 275–365.

[59]

F. Cao, M. Ester, W. Qian, A. Zhou, Density-based clustering over an evolving data stream with noise, in: SDM, SIAM, 2006, pp. 328–339.

[60]

Y. Cao, H. He, H. Man, Somke: Kernel density estimation over data streams by sequences of self-organizing maps, IEEE Trans. Neural Networks Learn. Syst. 23 (2012) 1254–1268.

[61]

Z. Cao, T. Qin, T.Y. Liu, M.F. Tsai, H. Li, Learning to rank: from pairwise approach to listwise approach, in: Proceedings of the 24th international conference on Machine learning, ACM, 2007, pp. 129–136.

[62]

G.A. Carpenter, S. Grossberg, N. Markuzon, J.H. Reynolds, D.B. Rosen, Fuzzy artmap: a neural network architecture for incremental supervised learning of analog multidimensional maps, Neural Networks IEEE Trans. 3 (1992) 698–713.

[63]

G.A. Carpenter, S. Grossberg, J.H. Reynolds, Artmap: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network, Neural Networks 4 (1991) 565–588.

Digital Library

[64]

R. Caruana, Multitask learning, Learning to learn. Springer (1998) 95–133.

[65]

G. Cavallanti, N. Cesa-Bianchi, C. Gentile, Tracking the best hyperplane with a simple budget perceptron, Mach. Learn. 69 (2007) 143–167.

[66]

G. Cavallanti, N. Cesa-Bianchi, C. Gentile, Linear algorithms for online multitask classification, J. Mach. Learn. Res. 11 (2010) 2901–2934.

[67]

N. Cesa-Bianchi, A. Conconi, C. Gentile, Learning probabilistic linear-threshold classifiers via selective sampling, in: Computational Learning Theory and Kernel Machines, 16th Annual Conference on Computational Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, 2003, pp. 373–387.

[68]

N. Cesa-Bianchi, A. Conconi, C. Gentile, On the generalization ability of on-line learning algorithms, IEEE Trans. Inf. Theory 50 (2004) 2050–2057.

[69]

N. Cesa-Bianchi, A. Conconi, C. Gentile, A second-order perceptron algorithm, SIAM J. Comput. 34 (2005) 640–668.

[70]

N. Cesa-Bianchi, C. Gentile, Improved risk tail bounds for on-line algorithms, IEEE Trans. Inf. Theory 54 (2008) 386–390.

[71]

N. Cesa-Bianchi, C. Gentile, F. Orabona, Robust bounds for classification via selective sampling, in: Proceedings of the 26th Annual International Conference on Machine Learning, ICML2009, 2009, pp. 121–128.

[72]

N. Cesa-Bianchi, C. Gentile, L. Zaniboni, Worst-case analysis of selective sampling for linear classification, J. Mach. Learn. Res. 7 (2006) 1205–1230.

[73]

N. Cesa-Bianchi, G. Lugosi, Prediction, learning, and games, Cambridge University Press, New York, NY, USA, 2006.

[74]

N. Cesa-Bianchi, G. Lugosi, Combinatorial bandits, J. Comput. Syst. Sci. 78 (2012) 1404–1422.

[75]

N. Cesa-Bianchi, G. Lugosi, G. Stoltz, Minimizing regret with label efficient prediction, IEEE Trans. Inf. Theory 51 (2005) 2152–2162.

[76]

N. Cesa-Bianchi, O. Shamir, Efficient transductive online learning via randomized rounding, Empirical Inference. Springer (2013) 177–194.

[77]

V. Chandola, A. Banerjee, V. Kumar, Anomaly detection: A survey, ACM computing surveys (CSUR) 41 (2009) 15.

[78]

Y.W. Chang, C.J. Hsieh, K.W. Chang, M. Ringgaard, C.J. Lin, Training and testing low-degree polynomial data mappings via linear svm, J. Mach. Learn. Res. 11 (2010) 1471–1490.

[79]

O. Chapelle, S.S. Keerthi, Efficient algorithms for ranking with svms, Inf. Retrieval 13 (2010) 201–215.

[80]

O. Chapelle, L. Li, An empirical evaluation of thompson sampling, Advances in neural information processing systems (2011) 2249–2257.

[81]

C. Chatfield, Time-series forecasting, CRC Press, 2000.

[82]

K. Chaudhuri, Y. Freund, D.J. Hsu, A parameter-free hedging algorithm, Advances in neural information processing systems (2009) 297–305.

[83]

G. Chen, G. Chen, J. Zhang, S. Chen, C. Zhang, Beyond banditron: A conservative and efficient reduction for online multiclass prediction with bandit setting model, in: 9th IEEE International Conference on Data Mining (ICDM2009), 2009, pp. 71–80.

[84]

N. Chen, S.C. Hoi, S. Li, X. Xiao, Simapp: A framework for detecting similar mobile applications by online kernel learning, in: Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, ACM, 2015, pp. 305–314.

[85]

N. Chen, S.C. Hoi, S. Li, X. Xiao, Mobile app tagging, in: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, ACM, 2016, pp. 63–72.

[86]

W. Chen, Y. Wang, Y. Yuan, Q. Wang, Combinatorial multi-armed bandit and its extension to probabilistically triggered arms, J. Mach. Learn. Res. 17 (2016) 1–33.

[87]

Y. Chen, L. Tu, Density-based clustering for real-time stream data, in: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2007, pp. 133–142.

[88]

Z. Chen, Z. Fang, W. Fan, A. Edwards, K. Zhang, Cstg: An effective framework for cost-sensitive sparse online learning, in: Proceedings of the 2017 SIAM International Conference on Data Mining SIAM, 2017, pp. 759–767.

[89]

A. Chernov, V. Vovk, Prediction with advice of unknown number of experts, in: Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, AUAI Press, 2010, pp. 117–125.

[90]

S.R. Chowdhury, A. Gopalan, On kernelized multi-armed bandits, in: International Conference on Machine Learning, 2017.

[91]

W. Chu, L. Li, L. Reyzin, R.E. Schapire, Contextual bandits with linear payoff functions, in: AISTATS, 2011, pp. 208–214.

[92]

M. Clements, D. Hendry, Forecasting economic time series, Cambridge University Press, 1998.

[93]

R. Combes, M.S.T.M. Shahi, A. Proutiere, et al., Combinatorial bandits revisited, Advances in Neural Information Processing Systems (2015) 2116–2124.

[94]

T.M. Cover, Universal portfolios, in: The Kelly Capital Growth Investment Criterion: Theory and Practice. World Scientific, 2011, pp. 181–209.

[95]

K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer, Online passive-aggressive algorithms, J. Mach. Learn. Res. 7 (2006) 551–585.

[96]

K. Crammer, M. Dredze, A. Kulesza, Multi-class confidence weighted algorithms, in: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009, pp. 496–504.

[97]

K. Crammer, C. Gentile, Multiclass classification with bandit feedback using adaptive regularization, in: Proceedings of 28th International Conference on Machine Learning (ICML2011), 2011, pp. 273–280.

[98]

K. Crammer, J.S. Kandola, Y. Singer, Online classification on a budget, in: NIPS, 2003, p. 5.

[99]

K. Crammer, A. Kulesza, M. Dredze, Adaptive regularization of weight vectors, Mach. Learn. (2009) 1–33.

[100]

K. Crammer, D.D. Lee, Learning via gaussian herding, Advances in neural information processing systems (2010) 451–459.

[101]

K. Crammer, Y. Singer, Online ranking by projecting, Neural Comput. 17 (2005) 145–175.

[102]

K. Crammer, Y. Singer, et al., Pranking with ranking., in: Nips, 2001, pp. 641–647.

[103]

A. Cutkosky, K.A. Boahen, Online convex optimization with unconstrained domains and losses, Advances In Neural Information Processing Systems (2016) 748–756.

[104]

A.S. Das, M. Datar, A. Garg, S. Rajaram, Google news personalization: scalable online collaborative filtering, in: Proceedings of the 16th international conference on World Wide Web, ACM, 2007, pp. 271–280.

[105]

J.V. Davis, B. Kulis, P. Jain, S. Sra, I.S. Dhillon, Information-theoretic metric learning, in: Proceedings of the 24th international conference on Machine learning, ACM, 2007, pp. 209–216.

[106]

O. Dekel, C. Gentile, K. Sridharan. Robust selective sampling from single and multiple teachers, in: COLT 2010 - The 23rd Conference on Learning Theory, Haifa, Israel, June 27–29, 2010, pp. 346–358.

[107]

O. Dekel, R. Gilad-Bachrach, O. Shamir, L. Xiao, Optimal distributed online prediction using mini-batches, J. Mach. Learn. Res. 13 (2012) 165–202.

[108]

O. Dekel, P.M. Long, Y. Singer, Online multitask learning, International Conference on Computational Learning Theory, Springer (2006) 453–467.

[109]

O. Dekel, S. Shalev-Shwartz, Y. Singer, The forgetron: a kernel-based perceptron on a fixed budget, in: NIPS, 2005.

[110]

T.G. Dietterichx, Machine learning for sequential data: a review, in: Structural, syntactic, and statistical pattern recognition. Springer, 2002, pp. 15–30.

[111]

Y. Ding, P. Zhao, S.C. Hoi, Y.S. Ong, An adaptive gradient method for online auc maximization, in: Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.

[112]

S. Disabato, M. Roveri, Learning convolutional neural networks in presence of concept drift, in: 2019 International Joint Conference on Neural Networks (IJCNN), IEEE, 2019, pp. 1–8.

[113]

M. Dredze, K. Crammer. Active learning with confidence, in: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers, 2008, pp. 233–236.

[114]

M. Dredze, K. Crammer, F. Pereira, Confidence-weighted linear classification, in: Proceedings of the 25th international conference on Machine learning ACM, 2008, pp. 264–271.

[115]

Y. Du, Z. Tan, Q. Chen, Y. Zhang, C. Wang, Homogeneous online transfer learning with online distribution discrepancy minimization, 2019, arXiv preprint arXiv:1912.13226.

[116]

J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011) 2121–2159.

[117]

J. Duchi, Y. Singer, Efficient online and batch learning using forward backward splitting, J. Mach. Learn. Res. 10 (2009) 2899–2934.

[118]

J.C. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011) 2121–2159.

[119]

J.C. Duchi, S. Shalev-Shwartz, Y. Singer, A. Tewari, Composite objective mirror descent, COLT (2010) 14–26.

[120]

R. Elwell, R. Polikar, Incremental learning of concept drift in nonstationary environments, Neural Networks IEEE Trans. 22 (2011) 1517–1531.

[121]

T. van Erven, W.M. Koolen, Metagrad: Multiple learning rates in online learning, Advances in Neural Information Processing Systems (2016) 3666–3674.

[122]

T. Evgeniou, M. Pontil, Regularized multi–task learning, in: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2004, pp. 109–117.

[123]

J. Farquhar, D. Hardoon, H. Meng, J.S. Shawe-taylor, S. Szedmak, Two view learning: Svm-2k, theory and practice, Advances in neural information processing systems (2006) 355–362.

[124]

J. Feng, H. Xu, S. Mannor, S. Yan, Online pca for contaminated data, Advances in Neural Information Processing Systems (2013) 764–772.

[125]

A. Fiat, G. Woeginger, Online algorithms: The state of the art, Springer, Heidelberg, 1998.

[126]

D.H. Fisher, Knowledge acquisition via incremental conceptual clustering, Mach. Learn. 2 (1987) 139–172.

[127]

D. Fotakis, T. Lianeas, G. Piliouras, S. Skoulakis, Efficient online learning of optimal rankings: Dimensionality reduction via gradient descent, in: Advances in Neural Information Processing Systems 33, 2020.

[128]

Y. Freund, R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci. 55 (1997) 119–139.

Digital Library

[129]

Y. Freund, R.E. Schapire, Adaptive game playing using multiplicative weights, Games Econ. Behav. 29 (1999) 79–103.

[130]

Y. Freund, R.E. Schapire, Large margin classification using the perceptron algorithm, Mach. Learn. 37 (1999) 277–296.

[131]

Y. Freund, H.S. Seung, E. Shamir, N. Tishby, Selective sampling using the query by committee algorithm, Mach. Learn. 28 (1997) 133–168.

[132]

D. Gabay, B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximation, Comput. Math. Appl. 2 (1976) 17–40.

[133]

A.A. Gaivoronski, F. Stella, Stochastic nonstationary optimization for finding universal portfolios, Ann. Oper. Res. 100 (2000) 165–188.

[134]

J. Gao, J. Li, Z. Zhang, P.N. Tan, An incremental data stream clustering algorithm based on dense units detection, in: Advances in Knowledge Discovery and Data Mining. Springer, 2005, pp. 420–425.

[135]

W. Gao, R. Jin, S. Zhu, Z.H. Zhou, One-pass auc optimization, in: ICML, 2013.

[136]

X. Gao, S.C. Hoi, Y. Zhang, J. Wan, J. Li, Soml: Sparse online metric learning with application to image retrieval, in: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.

[137]

X. Gao, S.C. Hoi, Y. Zhang, J. Zhou, J. Wan, Z. Chen, J. Li, J. Zhu, Sparse online learning of image similarity, ACM Trans. Intell. Syst. Technol. 8 (2017) 64.

[138]

Y. Gao, Y.F. Li, S. Chandra, L. Khan, B. Thuraisingham, Towards self-adaptive metric learning on the fly, The World Wide Web Conference (2019) 503–513.

[139]

L. Ge, J. Gao, H. Ngo, K. Li, A. Zhang, On handling negative transfer and imbalanced distributions in multiple source transfer learning, Stat. Anal. Data Min.: ASA Data Sci. J. 7 (2014) 254–271.

[140]

L. Ge, J. Gao, A. Zhang, Oms-tl: a framework of online multiple source transfer learning, in: Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, ACM, 2013, pp. 2423–2428.

[141]

C. Gentile, A new approximate maximal margin classification algorithm, J. Mach. Learn. Res. 2 (2001) 213–242.

[142]

B. George, Time Series Analysis: Forecasting & Control, Pearson Education India, 1994, p. 3/e.

[143]

P.M. Ghari, Y. Shen, Online multi-kernel learning with graph-structured feedback, International Conference on Machine Learning, PMLR (2020) 3474–3483.

[144]

J. Gittins, K. Glazebrook, R. Weber, Multi-armed bandit allocation indices, John Wiley & Sons, 2011.

[145]

J.C. Gittins, Bandit processes and dynamic allocation indices, J. R. Stat. Soc. Ser. B (1979) 148–177.

[146]

A.B. Goldberg, M. Li, X. Zhu, Online manifold regularization: a new learning setting and empirical study, Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer (2008) 393–407.

[147]

A.B. Goldberg, X. Zhu, A. Furger, J.M. Xu. Oasis: Online active semi-supervised learning, in: AAAI, 2011.

[148]

S. Guha, A. Meyerson, N. Mishra, R. Motwani, L. O’Callaghan, Clustering data streams: Theory and practice, Knowl. Data Eng. IEEE Trans. 15 (2003) 515–528.

[149]

S. Guha, N. Mishra, R. Motwani, L. O’Callaghan, Clustering data streams, in: Foundations of Computer Science, 2000, Proceedings. 41st Annual Symposium on, IEEE, 2000, pp. 359–366.

[150]

M. Gupta, J. Gao, C.C. Aggarwal, J. Han, Outlier detection for temporal data: A survey, IEEE Trans. Knowl. Data Eng. 26 (2014) 2250–2267.

[151]

L. Gyorfi, D. Schafer, Nonparametric prediction, Nato Sci. Ser. Sub Ser. III 190 (2003) 341–356.

[152]

L. Györfi, F. Udina, H. Walk, Nonparametric nearest neighbor based empirical portfolio selection strategies, Stat. Decis. Int. Math. J. Stochastic Methods Models 26 (2008) 145–157.

[153]

B. Han, D. Comaniciu, Y. Zhu, L.S. Davis, Sequential kernel density approximation and its application to real-time visual tracking, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2008) 1186–1197.

[154]

L. Hang, A short introduction to learning to rank, IEICE Trans. Inf. Syst. 94 (2011) 1854–1862.

[155]

J. Hannan, Approximation to bayes risk in repeated play, Contrib. Theory Games 3 (1957) 2.

[156]

S. Hao, S.C. Hoi, C. Miao, P. Zhao, Active crowdsourcing for annotation, in: Web Intelligence and Intelligent Agent Technology (WI-IAT) 2015 IEEE/WIC/ACM International Conference on, 2015, pp. 1–8.

[157]

S. Hao, P. Hu, P. Zhao, S.C. Hoi, C. Miao, Online active learning with expert advice, ACM Trans. Knowl. Discov. Data 12 (2018) 1–22.

[158]

S. Hao, J. Lu, P. Zhao, C. Zhang, S.C. Hoi, C. Miao, Second-order online active learning and its applications, IEEE Trans. Knowl. Data Eng. (2017).

[159]

S. Hao, P. Zhao, S.C. Hoi, C. Miao, Learning relative similarity from data streams: active online learning approaches, in: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, ACM, 2015, pp. 1181–1190.

[160]

S. Hao, P. Zhao, Y. Liu, S.C. Hoi, C. Miao, Online multitask relative similarity learning, International Joint Conference on Artificial Intelligence (2017).

[161]

S. Hao, P. Zhao, J. Lu, S.C. Hoi, C. Miao, C. Zhang, Soal: Second-order online active learning, in: Data Mining (ICDM), 2016 IEEE 16th International Conference on IEEE, 2016, pp. 931–936.

[162]

E.F. Harrington, Online ranking/collaborative filtering using the perceptron algorithm, ICML (2003) 250–257.

[163]

E. Hazan, A. Agarwal, S. Kale, Logarithmic regret algorithms for online convex optimization, Mach. Learn. 69 (2007) 169–192.

[164]

E. Hazan, S. Kale, Newtron: an efficient bandit algorithm for online multiclass prediction, Advances in Neural Information Processing Systems (2011) 891–899.

[165]

E. Hazan, A. Rakhlin, P.L. Bartlett, Adaptive online gradient descent, Advances in Neural Information Processing Systems (2007) 65–72.

[166]

E. Hazan, C. Seshadhri, Efficient learning algorithms for changing environments, in: Proceedings of the 26th annual international conference on machine learning, ACM, 2009, pp. 393–400.

[167]

E. Hazan, et al., Introduction to online convex optimization, Found. Trends Optim. 2 (2016) 157–325.

[168]

R. Heckel, K. Ramchandran, The sample complexity of online one-class collaborative filtering, in: International Conference on Machine Learning, 2017.

[169]

D. Helmbold, S. Panizza, Some label efficient learning results, in: Proceedings of the Tenth Annual Conference on Computational Learning Theory, ACM, 1997, pp. 218–230.

[170]

D.P. Helmbold, R.E. Schapire, Y. Singer, M.K. Warmuth, On-line portfolio selection using multiplicative updates, Math. Finance 8 (1998) 325–347.

[171]

R. Herbrich, T. Graepel, K. Obermayer, Support vector learning for ordinal regression, 1999.

[172]

M. Herbster, S. Pasteris, L. Tse, Online multitask learning with long-term memory, in: Advances in Neural Information Processing Systems 33, 2020.

[173]

S.C. Hoi, J. Wang, P. Zhao, Libol: a library for online learning algorithms, J. Mach. Learn. Res. 15 (2014) 495–499.

[174]

S.C. Hoi, J. Wang, P. Zhao, R. Jin, Online feature selection for mining big data, in: Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, ACM, 2012, pp. 93–100.

[175]

S.C.H. Hoi, R. Jin, P. Zhao, T. Yang, Online multiple kernel classification, Mach. Learn. 90 (2013) 289–316.

[176]

P. Honeine, Online kernel principal component analysis: a reduced-order model, IEEE Trans. Pattern Anal. Mach. Intell. (2012) 1814–1826.

[177]

R. Hong, A. Chandra, Dlion: decentralized distributed deep learning in micro-clouds, in: 11th { USENIX } Workshop on Hot Topics in Cloud Computing (HotCloud 19), 2019.

[178]

J. Hu, H. Yang, I. King, M.R. Lyu, A.M.C. So, Kernelized online imbalanced learning with fixed budgets, in: AAAI, 2015, pp. 2666–2672.

[179]

D. Huang, J. Zhou, B. Li, S.C. Hoi, S. Zhou, Robust median reversion strategy for on-line portfolio selection, in: Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, AAAI Press, 2013, pp. 2006–2012.

[180]

D. Huang, Y. Zhu, B. Li, S. Zhou, S.C. Hoi, Semi-universal portfolios with transaction costs, in: Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

[181]

P. Jain, B. Kulis, I.S. Dhillon, K. Grauman, Online metric learning and fast similarity search, Advances in neural information processing systems (2009) 761–768.

[182]

S.I. Jang, Online passive-aggressive total-error-rate minimization, 2020, arXiv preprint arXiv:2002.01771.

[183]

R. Jenatton, J. Huang, C. Archambeau, Adaptive algorithms for online convex optimization with long-term constraints, NIPS (2016).

[184]

R. Jézéquel, P. Gaillard, A. Rudi, Efficient online learning with kernels for adversarial large scale problems, Advances in Neural Information Processing Systems (2019) 9432–9441.

[185]

C. Jia, C. Tan, A. Yong, A grid and density-based clustering algorithm for processing data stream, in: Genetic and Evolutionary Computing, 2008. WGEC’08. Second International Conference on, IEEE, 2008, pp. 517–521.

[186]

L. Jie, F. Orabona, M. Fornoni, B. Caputo, N. Cesa-bianchi, Om-2: An online multi-class multi-kernel learning algorithm, in: Proc. of the 4th IEEE Online Learning for Computer Vision Workshop, 2010.

[187]

R. Jin, S.C.H. Hoi, T. Yang, Online multiple kernel learning: algorithms and mistake bounds, in: Algorithmic Learning Theory, 21st International Conference, ALT 2010, Canberra, Australia, October 6–8, 2010. Proceedings, 2010, pp. 390–404.

[188]

R. Jin, S. Wang, Y. Zhou, Regularized distance metric learning: Theory and algorithm, Advances in neural information processing systems (2009) 862–870.

[189]

D. Johnson, S. Levesque, T. Zhang, Interactive machine learning system for automated annotation of information in text. US Patent App. 10/630,854, 2003.

[190]

K.S. Jun, A. Bhargava, R. Nowak, R. Willett, Scalable generalized linear bandits: Online computation and hashing, Advances in Neural Information Processing Systems, 2017.

[191]

L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: A survey, J. Artif. Intell. Res. (1996) 237–285.

[192]

S.M. Kakade, S. Shalev-Shwartz, A. Tewari, Efficient bandit algorithms for online multiclass prediction, ICML (2008) 440–447.

[193]

S.M. Kakade, A. Tewari, On the generalization ability of online strongly convex programming algorithms, Advances in Neural Information Processing Systems (2009) 801–808.

[194]

A.T. Kalai, S. Vempala, Efficient algorithms for online decision problems, J. Comput. Syst. Sci. 71 (2005) 291–307.

[195]

S. Kale, Z. Karnin, T. Liang, D. Pál, Adaptive feature selection: Computationally efficient online sparse linear regression under rip, in: International Conference on Machine Learning, 2017.

[196]

P. Kar, B.K. Sriperumbudur, P. Jain, H.C. Karnick, On the generalization ability of online learning algorithms for pairwise loss functions, in: ICML, 2013.

[197]

M.N. Katehakis, A.F. Veinott Jr, The multi-armed bandit problem: decomposition and computation, Math. Oper. Res. 12 (1987) 262–268.

[198]

L. Kaufman, P.J. Rousseeuw, Clustering large applications (program clara), Finding groups in data: an introduction to cluster analysis (2008) 126–146.

[199]

E. Kaufmann, O. Cappé, A. Garivier, On bayesian upper confidence bounds for bandit problems, Artificial Intelligence and Statistics (2012) 592–600.

[200]

J.L. Kelly Jr, A new interpretation of information rate, in: The Kelly Capital Growth Investment Criterion: Theory and Practice, World Scientific, 2011, pp. 25–34.

[201]

Z.A. Khan, S. Zubair, N.I. Chaudhary, M.A.Z. Raja, F.A. Khan, N. Dedovic, Design of normalized fractional sgd computing paradigm for recommender systems, Neural Comput. Appl. (2019) 1–18.

[202]

J. Kivinen, A.J. Smola, R.C. Williamson, Online learning with kernels, Signal Processing, IEEE Transactions on 52 (2004) 2165–2176.

[203]

J. Kivinen, M.K. Warmuth, Additive versus exponentiated gradient updates for linear prediction, in: Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing (STOC’95), 1995, pp. 209–218.

[204]

R.D. Kleinberg, Nearly tight bounds for the continuum-armed bandit problem, Advances in Neural Information Processing Systems (2005) 697–704.

[205]

R.D. Kleinberg, Online decision problems with large strategy sets, Ph.D. thesis, Massachusetts Institute of Technology, 2005.

[206]

M. Kloft, P. Laskov, Security analysis of online centroid anomaly detection, J. Mach. Learn. Res. 13 (2012) 3681–3724.

[207]

A. Kobren, N. Monath, A. Krishnamurthy, A. McCallum, An online hierarchical algorithm for extreme clustering, 2017, arXiv preprint arXiv:1704.01858.

[208]

W.M. Koolen, T. Van Erven, Second-order quantile methods for experts and combinatorial games, Conference on Learning Theory (2015) 1155–1175.

[209]

Y. Koren, R. Bell, C. Volinsky, Matrix factorization techniques for recommender systems, Computer (2009) 30–37.

[210]

P. Kranen, I. Assent, C. Baldauf, T. Seidl, The clustree: indexing micro-clusters for anytime stream mining, Knowledge and information systems 29 (2011) 249–272.

[211]

W. Krauth, M. Mézard, Learning algorithms with optimal stability in neural networks, J. Phys. A: Math. Gen. 20 (1987) L745.

[212]

M. Kristan, A. Leonardis, D. Skocaj, Multivariate online kernel density estimation with gaussian kernels, Pattern Recogn. 44 (2011) 2630–2642.

[213]

S.I. Ktena, A. Tejani, L. Theis, P.K. Myana, D. Dilipkumar, F. Huszár, S. Yoo, W. Shi, Addressing delayed feedback for continuous training with neural networks in ctr prediction, in: Proceedings of the 13th ACM Conference on Recommender Systems, 2019, pp. 187–195.

[214]

A. Kumar, H. Daumé III, Learning task grouping and overlap in multi-task learning, in: Proceedings of the 29th International Conference on Machine Learning Omnipress, 2012, pp. 1723–1730.

[215]

D. Kuzmin, M.K. Warmuth, Online kernel pca with entropic matrix updates, in: Proceedings of the 24th international conference on Machine learning, ACM, 2007, pp. 465–472.

[216]

T.L. Lai, H. Robbins, Asymptotically efficient adaptive allocation rules, Adv. Appl. Math. 6 (1985) 4–22.

[217]

J. Langford, L. Li, A. Strehl, Vowpal wabbit online learning project, 2007.

[218]

J. Langford, L. Li, T. Zhang, Sparse online learning via truncated gradient, J. Mach. Learn. Res. 10 (2009) 777–801.

[219]

J. Langford, T. Zhang, The epoch-greedy algorithm for multi-armed bandits with side information, NIPS (2008) 817–824.

[220]

M.H. Law, A.K. Jain, Incremental nonlinear dimensionality reduction by manifold learning, IEEE Trans. Pattern Anal. Mach. Intell. 28 (2006) 377–391.

[221]

T. Le, T. Nguyen, V. Nguyen, D. Phung, Dual space gradient descent for online learning, Advances In Neural Information Processing Systems (2016) 4583–4591.

[222]

Y.A. LeCun, L. Bottou, G.B. Orr, K.R. Müller, Efficient backprop, in: Neural Networks: Tricks of the Trade, Springer, 1998, pp. 9–48.

[223]

K.Y. Levy, Online to offline conversions and adaptive minibatch sizes, in: Advances in Neural Information Processing Systems, 2017.

[224]

B. Li, Online portfolio selection, Ph.D. thesis, Nanyang Technological University, 2013.

[225]

B. Li, S.C. Hoi, On-line portfolio selection with moving average reversion, 2012, arXiv preprint arXiv:1206.4626.

[226]

B. Li, S.C. Hoi, Online portfolio selection: A survey, ACM Comput. Surveys 46 (2014) 35.

[227]

B. Li, S.C. Hoi, V. Gopalkrishnan, Corn: Correlation-driven nonparametric learning approach for portfolio selection, ACM Trans. Intell. Syst. Technol. 2 (2011) 21.

[228]

B. Li, S.C. Hoi, D. Sahoo, Z.Y. Liu, Moving average reversion strategy for on-line portfolio selection, Artif. Intell. 222 (2015) 104–123.

[229]

B. Li, S.C. Hoi, P. Zhao, V. Gopalkrishnan, Confidence weighted mean reversion strategy for on-line portfolio selection, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 434–442.

[230]

B. Li, S.C. Hoi, P. Zhao, V. Gopalkrishnan, Confidence weighted mean reversion strategy for online portfolio selection, ACM Transactions on Knowledge Discovery from Data (TKDD) 7 (2013) 4.

[231]

B. Li, S.C.H. Hoi, Online Portfolio Selection: Principles and Algorithms, Crc Press, 2015.

[232]

B. Li, D. Sahoo, S.C. Hoi, Olps: a toolbox for on-line portfolio selection, J. Mach. Learn. Res. 17 (2016) 1–5.

[233]

B. Li, J. Wang, D. Huang, S.C. Hoi, Transaction cost optimization for online portfolio selection, Quantitative Finance (2017) 1–14.

[234]

B. Li, P. Zhao, S.C. Hoi, V. Gopalkrishnan, Pamr: Passive aggressive mean reversion strategy for portfolio selection, Mach. Learn. 87 (2012) 221–258.

[235]

C.J. Li, Z. Wang, H. Liu, Online ica: Understanding global dynamics of nonconvex optimization via diffusion processes, Advances in Neural Information Processing Systems (2016) 4967–4975.

[236]

G. Li, S.C. Hoi, K. Chang, R. Jain, Micro-blogging sentiment detection by collaborative online learning, IEEE Intl. Conference on Data Mining, IEEE (2010) 893–898.

[237]

G. Li, S.C. Hoi, K. Chang, W. Liu, R. Jain, Collaborative online multitask learning, IEEE Trans. Knowl. Data Eng. 26 (2014) 1866–1876.

[238]

G. Li, Y. Shen, P. Zhao, X. Lu, J. Liu, Y. Liu, S.C. Hoi, Detecting cyberattacks in industrial control systems using online learning algorithms, Neurocomputing 364 (2019) 338–348.

[239]

G. Li, P. Zhao, X. Lu, J. Liu, Y. Shen, Data analytics for fog computing by distributed online learning with asynchronous update, in: ICC 2019-2019 IEEE International Conference on Communications (ICC), IEEE, 2019, pp. 1–6.

[240]

K. Li, X. Zhou, F. Lin, W. Zeng, G. Alterovitz, Deep probabilistic matrix factorization framework for online collaborative filtering, IEEE Access 7 (2019) 56117–56128.

[241]

K. Li, X. Zhou, F. Lin, W. Zeng, B. Wang, G. Alterovitz, Sparse online collaborative filtering with dynamic regularization, Inf. Sci. 505 (2019) 535–548.

[242]

L. Li, W. Chu, J. Langford, R.E. Schapire, A contextual-bandit approach to personalized news article recommendation, in: Proceedings of the 19th international conference on World wide web ACM, 2010, pp. 661–670.

[243]

L. Li, Y. Lu, D. Zhou, Provable optimal algorithms for generalized linear contextual bandits, in: International Conference on Machine Learning, 2017.

[244]

Y. Li, P.M. Long, The relaxed online maximum margin algorithm, Mach. Learn. 46 (2002) 361–387.

[245]

Y. Li, M. Yang, Z. Zhang, Multi-view representation learning: A survey from shallow methods to deep methods, 2016, arXiv preprint arXiv:1610.01206.

[246]

Y. Li, H. Zaragoza, R. Herbrich, J. Shawe-Taylor, J. Kandola, The perceptron algorithm with uneven margins, ICML (2002) 379–386.

[247]

L. Li-xiong, K. Jing, G. Yun-fei, H. Hai, A three-step clustering algorithm over an evolving data stream, in: Intelligent Computing and Intelligent Systems, 2009, ICIS 2009, IEEE International Conference on, IEEE, 2009, pp. 160–164.

[248]

N.Y. Liang, G.B. Huang, P. Saratchandran, N. Sundararajan, A fast and accurate online sequential learning algorithm for feedforward networks, Neural Netw. IEEE Trans. 17 (2006) 1411–1423.

[249]

K.P. Lin, M.S. Chen, Efficient kernel approximation for large-scale support vector machine classification, in: Proceedings of the Eleventh SIAM International Conference on Data Mining SIAM, 2011, pp. 211–222.

[250]

X. Lin, W. Zhang, M. Zhang, W. Zhu, J. Pei, P. Zhao, J. Huang, Online compact convexified factorization machine, in: Proceedings of the 2018 World Wide Web Conference, 2018, pp. 1633–1642.

[251]

G. Ling, H. Yang, I. King, M.R. Lyu, Online learning for collaborative filtering, in: Neural Networks (IJCNN), The 2012 International Joint Conference on IEEE, 2012, pp. 1–8.

[252]

N. Littlestone, Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm, Mach. Learn. 2 (1988) 285–318.

[253]

N. Littlestone, From on-line to batch learning, in: Proceedings of the Second Annual Workshop on Computational Learning Theory, COLT 1989, Santa Cruz, CA, USA, July 31 - August 2, 1989, pp. 269–284.

[254]

N. Littlestone, M.K. Warmuth, The weighted majority algorithm, in: 30th Annual Symposium on Foundations of Computer Science, 1989, pp. 256–261.

[255]

N. Littlestone, M.K. Warmuth, The weighted majority algorithm, Inf. Comput. 108 (1994) 212–261.

[256]

C. Liu, S.C. Hoi, P. Zhao, J. Sun, Online arima algorithms for time series prediction, 2016.

[257]

C. Liu, S.C. Hoi, P. Zhao, J. Sun, E.P. Lim, Online adaptive passive-aggressive methods for non-negative matrix factorization and its applications, in: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, ACM, 2016, pp. 1161–1170.

[258]

C. Liu, T. Jin, S.C. Hoi, P. Zhao, J. Sun, Collaborative topic regression for online recommender systems: an online and bayesian approach, Mach. Learn. 106 (2017) 651–670.

[259]

N.N. Liu, M. Zhao, E. Xiang, Q. Yang, Online evolutionary collaborative filtering, in: Proceedings of 4th ACM conference on Recommender systems, 2010, pp. 95–102.

[260]

Y.W. Liyanage, D.S. Zois, C. Chelmis, On-the-fly joint feature selection and classification, 2020, arXiv preprint arXiv:2004.10245.

[261]

J. Lu, S. Hoi, J. Wang, Second order online collaborative filtering, Asian Conference on Machine Learning (2013) 325–340.

[262]

J. Lu, S.C. Hoi, J. Wang, P. Zhao, Z.Y. Liu, Large scale online kernel learning, J. Mach. Learn. Res. (2015).

[263]

J. Lu, D. Sahoo, P. Zhao, S.C. Hoi, Sparse passive-aggressive learning for bounded online kernel methods, ACM Trans. Intell. Syst. Technol. 9 (2018) 45.

[264]

J. Lu, P. Zhao, S.C. Hoi, Online passive aggressive active learning and its applications, in: The 6th Asian Conference on Machine Learning (ACML2014), 2014.

[265]

J. Lu, P. Zhao, S.C. Hoi, Online passive-aggressive active learning, Mach. Learn. 103 (2016) 141–183.

[266]

J. Lu, P. Zhao, S.C. Hoi, Online sparse passive aggressive learning with kernels, in: Proceedings of the 2016 SIAM International Conference on Data Mining SIAM, 2016, pp. 675–683.

[267]

H. Luo, A. Agarwal, N. Cesa-Bianchi, J. Langford, Efficient second order online learning by sketching, Advances in Neural Information Processing Systems (2016) 902–910.

[268]

H. Luo, R.E. Schapire, Achieving all with no parameters: Adanormalhedge, Conference on Learning Theory (2015) 1286–1304.

[269]

S. Magureanu, R. Combes, A. Proutière, Lipschitz bandits: regret lower bound and optimal algorithms, in: Proceedings of The 27th Conference on Learning Theory, COLT 2014, Barcelona, Spain, June 13–15, 2014, pp. 975–999.

[270]

A.F.T. Martins, N.A. Smith, E.P. Xing, P.M.Q. Aguiar, M.A.T. Figueiredo, Online learning of structured predictors with multiple kernels, J. Mach. Learn. Res. 15 (2011) 507–515.

[271]

B.C. May, N. Korda, A. Lee, D.S. Leslie, Optimistic bayesian sampling in contextual-bandit problems, J. Mach. Learn. Res. 13 (2012) 2069–2106.

[272]

H.B. McMahan, M.J. Streeter, Tighter bounds for multi-armed bandits with expert advice, in: COLT, 2009.

[273]

A. Mejer, K. Crammer, Confidence in structured-prediction using confidence-weighted models, in: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2010, pp. 971–981.

[274]

R.S. Michalski, I. Mozetic, J. Hong, N. Lavrac, The multi-purpose incremental learning system aq15 and its testing application to three medical domains, Proc. AAAI 1986 (1986) 1–041.

[275]

I. Mitliagkas, C. Caramanis, P. Jain, Memory limited, streaming pca, Advances in Neural Information Processing Systems (2013) 2886–2894.

[276]

J.F. Mota, J.M. Xavier, P.M. Aguiar, M. Püschel, D-admm: A communication-efficient distributed algorithm for separable optimization, IEEE Trans. Signal Process. 61 (2013) 2718–2723.

[277]

M. Mundt, Y.W. Hong, I. Pliushch, V. Ramesh, A wholistic view of continual learning with deep neural networks: Forgotten lessons and the bridge to active and open world learning, 2020, arXiv preprint arXiv:2009.01797.

[278]

K. Murugesan, H. Liu, J. Carbonell, Y. Yang, Adaptive smoothed online multi-task learning, Advances in Neural Information Processing Systems (2016) 4296–4304.

[279]

Y. Nesterov, Primal-dual subgradient methods for convex problems, Mathematical programming 120 (2009) 221–259.

[280]

T.D. Nguyen, T. Le, H. Bui, D. Phung, Large-scale online kernel learning with random feature reparameterization, in: Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI-17), 2017, pp. 2543–2549.

[281]

T.T. Nguyen, K. Chang, S.C. Hui, Two-view online learning, Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer (2012) 74–85.

[282]

X. Nie, M. Fan, X. Huang, W. Yang, B. Zhang, X. Ma, Online semisupervised active classification for multiview polsar data, IEEE Trans. Cybern. (2020).

[283]

H. Ning, J. Zhang, T.T. Feng, E.K.W. Chu, T. Tian, Control-based algorithms for high dimensional online learning, Journal of the Franklin Institute 357 (2020) 1909–1942.

[284]

N. Nisan, T. Roughgarden, E. Tardos, V.V. Vazirani, Algorithmic game theory, vol. 1, Cambridge University Press Cambridge, 2007.

[285]

A.B. Novikoff, On convergence proofs for perceptrons, (Technical Report) STANFORD RESEARCH INST, Menlo Park CA, 1963.

[286]

I. Ntoutsi, A. Zimek, T. Palpanas, P. Kröger, H.P. Kriegel, Density-based projected clustering over high dimensional data streams, in: SDM, SIAM, 2012, pp. 987–998.

[287]

L. O’Callaghan, A. Meyerson, R. Motwani, N. Mishra, S. Guha, Streaming-data algorithms for high-quality clustering, in: IEEE 29th International Conference on Data Engineering (ICDE), 2002, p. 685.

[288]

F. Orabona, N. Cesa-Bianchi, Better algorithms for selective sampling, in: Proc. 28th International Conference on Machine Learning (ICML2011), 2011, pp. 433–440.

[289]

F. Orabona, K. Crammer, New adaptive algorithms for online classification, Advances in neural information processing systems (2010) 1840–1848.

[290]

F. Orabona, J. Keshet, B. Caputo, Bounded kernel-based online learning, J. Mach. Learn. Res. 10 (2009) 2643–2666.

[291]

M. Ormos, A. Urbán, Performance analysis of log-optimal portfolio strategies with transaction costs, Quantitative Finance 13 (2013) 1587–1597.

[292]

S.J. Pan, Q. Yang, A survey on transfer learning, Knowledge and Data Engineering, IEEE Transactions on 22 (2010) 1345–1359.

[293]

G.I. Parisi, R. Kemker, J.L. Part, C. Kanan, S. Wermter, Continual lifelong learning with neural networks: a review, 2018, arXiv preprint arXiv:1802.07569.

[294]

Q. Pham, D. Sahoo, C. Liu, S.C. Hoi, Bilevel continual learning, 2020, arXiv preprint arXiv:2007.15553.

[295]

J. Platt, A resource-allocating network for function interpolation, Neural computation 3 (1991) 213–225.

[296]

J. Platt et al., Fast training of support vector machines using sequential minimal optimization, Advances in kernel methods: support vector learning 3, 1999.

[297]

G. Cauwenberghs, T. Poggio, Incremental and decremental support vector machine learning, in: Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference, MIT Press, 2001, p. 409.

[298]

R. Polikar, L. Udpa, S.S. Udpa, V. Honavar, Learn++: An incremental learning algorithm for supervised neural networks, Syst. Man Cybern. Part C: Appl. Rev. IEEE Trans. 31 (2001) 497–508.

[299]

M. Pratama, W. Pedrycz, G.I. Webb, An incremental construction of deep neuro fuzzy system for continual learning of non-stationary data streams, IEEE Trans. Fuzzy Syst. (2019).

[300]

M. Pratama, D. Wang, Deep stacked stochastic configuration networks for lifelong learning of non-stationary data streams, Inf. Sci. 495 (2019) 150–174.

[301]

M. Pratama, C. Za’in, A. Ashfahani, Y.S. Ong, W. Ding, Automatic construction of multi-layer perceptron network from streaming examples, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019, pp. 1171–1180.

[302]

A. Qahtan, S. Wang, X. Zhang, Kde-track: An efficient dynamic density estimator for data streams, IEEE Trans. Knowl. Data Eng. 29 (2017) 642–655.

[303]

A. Rahimi, B. Recht, Random features for large-scale kernel machines, Advances in neural information processing systems (2007) 1177–1184.

[304]

A. Rakhlin, Lecture notes on online learning. Notes appeared in the Statistical Learning Theory course at UC Berkeley, 2008.

[305]

A. Rakhlin, K. Sridharan, A. Tewari, Online learning: Random averages, combinatorial parameters, and learnability, Advances in Neural Information Processing Systems (2010) 1984–1992.

[306]

A. Rakotomamonjy, F.R. Bach, S. Canu, Y. Grandvalet, Simplemkl, J. Mach. Learn. Res. (JMLR) 11 (2008) 2491–2521.

[307]

J. Read, A. Bifet, B. Pfahringer, G. Holmes, Batch-incremental versus instance-incremental learning in dynamic and evolving data, in: Advances in Intelligent Data Analysis XI, Springer, 2012, pp. 313–323.

[308]

J. Ren, B. Cai, C. Hu, Clustering over data streams based on grid density and index tree, Journal of Convergence Information Technology 6 (2011).

[309]

J. Ren, R. Ma, Density-based data streams clustering over sliding windows, in: Fuzzy Systems and Knowledge Discovery, 2009. FSKD’09. Sixth International Conference on, IEEE, 2009, pp. 248–252.

[310]

H. Robbins, Some aspects of the sequential design of experiments, in: Herbert Robbins Selected Papers, Springer, 1985, pp. 169–177.

[311]

F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychological review 65 (1958) 386.

[312]

D. Roth, K. Small, I. Titov, Sequential learning of classifiers for structured prediction problems, International Conference on Artificial Intelligence and Statistics (2009) 440–447.

[313]

T. Roughgarden, O. Schrijvers, Online prediction with selfish experts, in: Advances In Neural Information Processing Systems, 2017.

[314]

S.T. Roweis, L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000) 2323–2326.

[315]

C. Ruiz, E. Menasalvas, M. Spiliopoulou, C-denstream: Using domain knowledge on a data stream, in: Discovery Science, Springer, 2009, pp. 287–301.

[316]

P. Rusmevichientong, J.N. Tsitsiklis, Linearly parameterized bandits, Mathematics of Operations Research 35 (2010) 395–411.

[317]

D. Russo, B. Van Roy, An information-theoretic analysis of thompson sampling, J. Mach. Learn. Res. 17 (2016) 2442–2471.

[318]

P. Ruvolo, E. Eaton, Ella: An efficient lifelong learning algorithm, International Conference on Machine Learning (2013) 507–515.

[319]

A. Saha, P. Rai, H. Daumé III, S. Venkatasubramanian, et al., Online learning of multiple tasks and their relationships, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 643–651.

[320]

D. Sahoo, S.C. Hoi, B. Li, Online multiple kernel regression, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining ACM, 2014, pp. 293–302.

[321]

D. Sahoo, S.C. Hoi, B. Li, Large scale online multiple kernel regression with application to time-series prediction, ACM Transactions on Knowledge Discovery from Data (TKDD) 13 (2019) 1–33.

[322]

D. Sahoo, Q. Pham, J. Lu, S.C.H. Hoi, Online deep learning: Learning deep neural networks on the fly, in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence IJCAI-18, 2018, pp. 2660–2666.

[323]

D. Sahoo, A. Sharma, S.C. Hoi, P. Zhao, Temporal kernel descriptors for learning with time-sensitive patterns, in: Proceedings of the 2016 SIAM International Conference on Data Mining SIAM, 2016, pp. 540–548.

[324]

D. Sahoo, P. Zhao, S.C. Hoi, Cost-sensitive online multiple kernel classification, in: Proceedings of The 8th Asian Conference on Machine Learning, 2016, pp. 65–80.

[325]

N.I. Sapankevych, R. Sankar, Time series prediction using support vector machines: a survey, Computational Intelligence Magazine, IEEE 4 (2009) 24–38.

[326]

B. Schölkopf, R. Herbrich, A.J. Smola, A generalized representer theorem, COLT/EuroCOLT (2001) 416–426.

[327]

S. Schuon, M. Durković, K. Diepold, J. Scheuerle, S. Markward, Truly incremental locally linear embedding, in: CoTeSys 1st International Workshop on Cognition for Technical Systems, 2008.

[328]

D.W. Scott, Multivariate density estimation: theory, practice, and visualization, John Wiley & Sons, 2015.

[329]

S.L. Scott, A modern bayesian look at the multi-armed bandit, Applied Stochastic Models in Business and Industry 26 (2010) 639–658.

[330]

H.S. Seung, M. Opper, H. Sompolinsky, Query by committee, in: Proceedings of the 5th Annual Workshop on Computational Learning Theory, 1992, pp. 287–294.

[331]

P. Shah, A. Soni, T. Chevalier, Online ranking with constraints: A primal-dual algorithm and applications to web traffic-shaping, in: KDD, 2017.

[332]

S. Shalev-Shwartz, Online learning: theory, algorithms, and applications, Ph.D. thesis, The Hebrew University of Jerusalem, 2007.

[333]

S. Shalev-Shwartz, Online learning and online convex optimization, Foundations and Trends in Machine Learning 4 (2011) 107–194.

[334]

S. Shalev-Shwartz, Y. Singer, A primal-dual perspective of online learning algorithms, Mach. Learn. 69 (2007) 115–142.

[335]

S. Shalev-Shwartz, Y. Singer, A.Y. Ng, Online and batch learning of pseudo-metrics, in: Proceedings of the twenty-first international conference on Machine learning ACM, 2004, p. 94.

[336]

S. Shalev-Shwartz, Y. Singer, N. Srebro, A. Cotter, Pegasos: Primal estimated sub-gradient solver for svm, Mathematical programming 127 (2011) 3–30.

[337]

S. Shalev-Shwartz, A. Tewari, Stochastic methods for l 1-regularized loss minimization, J. Mach. Learn. Res. (2011) 1865–1892.

[338]

Y. Shi, M. Larson, A. Hanjalic, Collaborative filtering beyond the user-item matrix: A survey of the state of the art and future challenges, ACM Computing Surveys (CSUR) 47 (2014) 3.

[339]

D.L. Silver, Q. Yang, L. Li, Lifelong machine learning systems: Beyond learning algorithms, in: AAAI Spring Symposium: Lifelong Machine Learning, 2013, p. 05.

[340]

B.W. Silverman, Density estimation for statistics and data analysis, Routledge, 2018.

[341]

P. Smyth, M. Welling, A.U. Asuncion, Asynchronous distributed learning of topic models, NIPS (2009) 81–88.

[342]

S. Sonnenburg, V. Franc, Coffin: a computational framework for linear svms, in: Proceedings of the 27th International Conference on Machine Learning, Omnipress, 2010, pp. 999–1006.

[343]

S. Sonnenburg, G. Rätsch, C. Schäfer, B. Schölkopf, Large scale multiple kernel learning, J. Mach. Learn. Res. (JMLR) 7 (2006) 1531–1565.

[344]

R. Sousa, L.M. Silva, L.A. Alexandre, J. Santos, J.M. de Sá, Transfer learning: Current status, trends and challenges.

[345]

E.J. Spinosa, F. de Leon, A. Ponce, J. Gama, Novelty detection with application to data streams, Intelligent Data Analysis 13 (2009) 405–422.

[346]

X. Su, T.M. Khoshgoftaar, A survey of collaborative filtering techniques, Advances in artificial intelligence 2009 (2009) 4.

[347]

S. Sun, A survey of multi-view machine learning, Neural Comput. Appl. 23 (2013) 2031–2038.

[348]

R.S. Sutton, A.G. Barto, Reinforcement learning: An introduction, MIT press, 1998.

[349]

M. Takada, H. Fujisawa, Transfer learning via l1 regularization, Advances in Neural Information Processing Systems 33 (2020).

[350]

S.C. Tan, K.M. Ting, T.F. Liu, Fast anomaly detection for streaming data, in: IJCAI Proceedings-International Joint Conference on Artificial Intelligence, 2011, p. 1511.

[351]

Y. Tao, S. Lu, From online to non-iid batch learning, in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 328–337.

[352]

D.K. Tasoulis, G. Ross, N.M. Adams, Visualising the cluster structure of data streams, in: Advances in Intelligent Data Analysis VII, Springer, 2007, pp. 81–92.

[353]

J.B. Tenenbaum, V. De Silva, J.C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (2000) 2319–2323.

[354]

W.R. Thompson, On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika 25 (1933) 285–294.

[355]

T. Tommasi, F. Orabona, M. Kaboli, B. Caputo, C. Martigny, Leveraging over prior knowledge for online learning of visual categories, in: BMVC, 2012.

[356]

A. Trotman, Learning to rank, Inf. Retrieval 8 (2005) 359–381.

[357]

P. Tseng, On accelerated proximal gradient methods for convex-concave optimization, SIAM Journal on Optimization (2008).

[358]

L. Tu, Y. Chen, Stream data clustering based on grid density and attraction, ACM Transactions on Knowledge Discovery from Data (TKDD) 3 (2009) 12.

[359]

Q. Tu, J. Lu, B. Yuan, J. Tang, J.Y. Yang, Density-based hierarchical clustering for streaming data, Pattern Recogn. Lett. 33 (2012) 641–645.

[360]

T. Uchiya, A. Nakamura, M. Kudo, Algorithms for adversarial bandit problems with multiple plays, in: International Conference on Algorithmic Learning Theory, Springer, 2010, pp. 375–389.

[361]

M. Valko, B. Kveton, L. Huang, D. Ting, Online semi-supervised learning on quantized graphs, in: Uncertainty in Artificial Intelligence, 2010.

[362]

V.N. Vapnik, An overview of statistical learning theory, IEEE transactions on neural networks 10 (1999) 988–999.

[363]

V.N. Vapnik, Statistical learning theory, volume 1, Wiley, New York, 1998.

[364]

J. Vermorel, M. Mohri, Multi-armed bandit algorithms and empirical evaluation, in: Machine Learning: ECML 2005, Springer, 2005, pp. 437–448.

[365]

V. Vovk, C. Watkins, Universal portfolio selection, in: Proceedings of the Eleventh Annual Conference on Computational Learning Theory, ACM, 1998, pp. 12–23.

[366]

J. Wan, P. Wu, S.C. Hoi, P. Zhao, X. Gao, D. Wang, Y. Zhang, J. Li, Online learning to rank for content-based image retrieval, in: IJCAI, 2015, pp. 2284–2290.

[367]

L. Wan, W.K. Ng, X.H. Dang, P.S. Yu, K. Zhang, Density-based clustering of data streams at multiple resolutions, ACM Transactions on Knowledge Discovery from Data (TKDD) 3 (2009) 14.

[368]

C. Wang, Y. Lu, The scaling limit of high-dimensional online independent component analysis, Advances in Neural Information Processing Systems (2017) 6638–6647.

[369]

D. Wang, P. Wu, P. Zhao, S.C. Hoi, A framework of sparse online learning and its applications, 2015, arXiv preprint arXiv:1507.07146.

[370]

D. Wang, P. Wu, P. Zhao, Y. Wu, C. Miao, S.C. Hoi, High-dimensional data stream classification via sparse online learning, in: Data Mining (ICDM), 2014 IEEE International Conference on IEEE, 2014, pp. 1007–1012.

[371]

H. Wang, A. Banerjee, Online alternating direction method, in: Proceedings of the 29th International Conference on Machine Learning, ICML 2012 June 26 - July 1, 2012. Edinburgh, Scotland, UK, 2012.

[372]

H. Wang, W. Fan, P.S. Yu, J. Han, Mining concept-drifting data streams using ensemble classifiers, in: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining ACM, 2003, pp. 226–235.

[373]

J. Wang, S.C. Hoi, P. Zhao, Z.Y. Liu, Online multi-task collaborative filtering for on-the-fly recommender systems, in: Proceedings of the 7th ACM conference on Recommender systems ACM, 2013, pp. 237–244.

[374]

J. Wang, S.C. Hoi, P. Zhao, J. Zhuang, Z.Y. Liu, Large scale online kernel classification, in: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, AAAI Press, 2013, pp. 1750–1756.

[375]

J. Wang, J. Wan, Y. Zhang, S.C. Hoi, Solar: Scalable online learning algorithms for ranking, ACL, 2015.

[376]

J. Wang, P. Zhao, S.C. Hoi, Exact soft confidence-weighted learning, in: Proceedings of the 29th International Conference on Machine Learning, Omnipress, 2012, pp. 107–114.

[377]

J. Wang, P. Zhao, S.C. Hoi, Cost-sensitive online classification, Knowledge and Data Engineering, IEEE Transactions on 26 (2014) 2425–2438.

[378]

J. Wang, P. Zhao, S.C. Hoi, Soft confidence-weighted learning, ACM Transactions on Intelligent Systems and Technology (TIST) 8 (2016) 15.

[379]

J. Wang, P. Zhao, S.C. Hoi, R. Jin, Online feature selection and its applications, IEEE Trans. Knowl. Data Eng. 26 (2014) 698–710.

[380]

J. Wang, P. Zhao, S.C.H. Hoi, Cost-sensitive online classification, in: 12th IEEE International Conference on Data Mining (ICDM2012), 2012, pp. 1140–1145.

[381]

S. Wang, R. Jin, H. Valizadegan, A potential-based framework for online multi-class learning with partial feedback, in: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 900–907.

[382]

S. Wang, L.L. Minku, X. Yao, Dealing with multiple classes in online class imbalance learning, in: International Joint Conferences on Artificial Intelligence, 2016.

[383]

Y. Wang, J.Y. Audibert, R. Munos, Algorithms for infinitely many-armed bandits, Advances in Neural Information Processing Systems (2009) 1729–1736.

[384]

Y. Wang, Z. Jiang, X. Chen, P. Xu, Y. Zhao, Y. Lin, Z. Wang, E2-train: Training state-of-the-art cnns with over 80% energy savings, Advances in Neural Information Processing Systems (2019) 5138–5150.

[385]

Y. Wang, R. Khardon, D. Pechyony, R. Jones, Generalization bounds for online learning algorithms with pairwise loss functions, in: COLT, 2012, pp. 13–1.

[386]

Z. Wang, K. Crammer, S. Vucetic, Breaking the curse of kernelization: Budgeted stochastic gradient descent for large-scale svm training, J. Mach. Learn. Res. 13 (2012) 3103–3131.

[387]

Z. Wang, S. Vucetic, Tighter perceptron with improved dual use of cached data for model representation and validation, in: Neural Networks, 2009. IJCNN 2009. International Joint Conference on, IEEE, 2009, pp. 3297–3302.

[388]

Z. Wang, S. Vucetic, Online passive-aggressive algorithms on a budget, Journal of Machine Learning Research - Proceedings Track 9 (2010) 908–915.

[389]

M. Ware, E. Frank, G. Holmes, M. Hall, I.H. Witten, Interactive machine learning: letting users build classifiers, Int. J. Hum Comput Stud. 55 (2001) 281–292.

[390]

M.K. Warmuth, D. Kuzmin, Randomized online pca algorithms with regret bounds that are logarithmic in the dimension, J. Mach. Learn. Res. 9 (2008).

[391]

J. Weston, A. Bordes, L. Bottou, et al., Online (and offline) on an even tighter budget, in: Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics, 2005, pp. 413–420.

[392]

C.K.I. Williams, M. Seeger, Using the nyström method to speed up kernel machines, in: T. Leen, T. Dietterich, V. Tresp (Eds.), Advances in Neural Information Processing Systems 13, MIT Press, 2001, pp. 682–688.

[393]

R.J. Williams, D. Zipser, A learning algorithm for continually running fully recurrent neural networks, Neural computation 1 (1989) 270–280.

[394]

C.H. Wu, H.H.S. Lu, H.M. Hang, Budgeted passive-aggressive learning for online multiclass classification, IEEE Access (2020).

[395]

P. Wu, S.C. Hoi, H. Xia, P. Zhao, D. Wang, C. Miao, Online multimodal deep similarity learning with application to image retrieval, in: Proceedings of 21st ACM international conference on Multimedia, 2013, pp. 153–162.

[396]

P. Wu, S.C. Hoi, P. Zhao, C. Miao, Z.Y. Liu, Online multi-modal distance metric learning with application to image retrieval, IEEE Trans. Knowl. Data Eng. 28 (2016) 454–467.

[397]

Y. Wu, S.C. Hoi, C. Liu, J. Lu, D. Sahoo, N. Yu, Sol: A library for scalable online learning algorithms, Neurocomputing 260 (2017) 9–12.

[398]

Y. Wu, S.C. Hoi, T. Mei, Massive-scale online feature selection for sparse ultra-high dimensional data, 2014, arXiv preprint arXiv:1409.7794.

[399]

Y. Wu, S.C. Hoi, T. Mei, N. Yu, Large-scale online feature selection for ultra-high dimensional sparse data, ACM Transactions on Knowledge Discovery from Data (TKDD) 11 (2017) 48.

[400]

H. Xia, S.C. Hoi, R. Jin, P. Zhao, Online multiple kernel similarity learning for visual search, IEEE Trans. Pattern Anal. Mach. Intell. 36 (2014) 536–549.

[401]

H. Xia, P. Wu, S.C. Hoi, Online multi-modal distance learning for scalable multimedia retrieval, in: Proceedings of the sixth ACM international conference on Web search and data mining, ACM, 2013, pp. 455–464.

[402]

L. Xiao, Dual averaging method for regularized stochastic learning and online optimization, Advances in Neural Information Processing Systems (2009) 2116–2124.

[403]

C. Xu, D. Tao, C. Xu, A survey on multi-view learning, 2013, arXiv preprint arXiv:1304.5634.

[404]

K. Xu, Y. Li, R. Deng, K. Chen, J. Xu, Droidevolver: Self-evolving android malware detection system, in: 2019 IEEE European Symposium on Security and Privacy (EuroS&P), IEEE, 2019, pp. 47–62.

[405]

Z. Xu, R. Jin, I. King, M.R. Lyu, An extended level method for efficient multiple kernel learning, in: NIPS, 2008.

[406]

D. Yang, E.A. Rundensteiner, M.O. Ward, Neighbor-based pattern detection for windows over streaming data, in: Proceedings of the 12th International Conference on Extending Database Technology (EDBT), 2009, pp. 529–540.

[407]

H. Yang, I. King, M.R. Lyu, Online learning for multi-task feature selection, in: Proceedings of the 19th ACM international conference on Information and knowledge management, ACM, 2010, pp. 1693–1696.

[408]

L. Yang, R. Jin, Distance metric learning: A comprehensive survey, Michigan State University 2 (2006) 4.

[409]

L. Yang, R. Jin, J. Ye, Online learning by ellipsoid method, in: Proceedings of 26th International Conference on Machine Learning, 2009, pp. 1153–1160.

[410]

P. Yang, P. Zhao, X. Gao, Bandit online learning on graphs via adaptive optimization, in: International Joint Conferences on Artificial Intelligence, 2018.

[411]

P. Yang, P. Zhao, J. Zhou, X. Gao, Confidence weighted multitask learning, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2019, pp. 5636–5643.

[412]

T. Yang, M. Mahdavi, R. Jin, J. Yi, S.C. Hoi, Online kernel selection: Algorithms and evaluations, in: AAAI, 2012.

[413]

Y. Yang, D.W. Zhou, D.C. Zhan, H. Xiong, Y. Jiang, Adaptive deep models for incremental learning: Considering capacity scalability and sustainability, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 74–82.

[414]

Y. Ying, L. Wen, S. Lyu, Stochastic online auc maximization, Advances in Neural Information Processing Systems (2016) 451–459.

[415]

L. Yuan-Xiang, L. Zhi-Jie, W. Feng, K. Li, Accelerated online learning for collaborative filtering and recommender systems, in: Data Mining Workshop (ICDMW), 2014 IEEE International Conference on, IEEE, 2014, pp. 879–885.

[416]

C. Zeng, Q. Wang, S. Mokhtari, T. Li, Online context-aware recommendation with time varying multi-arm bandit, in: KDD, 2016.

[417]

C. Zhang, Online federated learning over decentralized networks. Ph.D. thesis, 2018.

[418]

C. Zhang, S.C. Hoi, Partially observable multi-sensor sequential change detection: A combinatorial multi-armed bandit approach, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2019, pp. 5733–5740.

[419]

J. Zhang, H. Ning, Online kernel classification with adjustable bandwidth using control-based learning approach, Pattern Recogn. 108 (2020).

[420]

J. Zhang, H. Ning, X. Jing, T. Tian, Online kernel learning with adaptive bandwidth by optimal control approach, IEEE Transactions on Neural Networks and Learning Systems (2020).

[421]

L. Zhang, R. Jin, C. Chen, J. Bu, X. He, Efficient online learning for large-scale sparse kernel logistic regression, in: AAAI, 2012.

[422]

L. Zhang, T. Yang, R. Jin, Y. Xiao, Z.H. Zhou, Online stochastic linear optimization under one-bit feedback, International Conference on Machine Learning (2016) 392–401.

[423]

T. Zhang, Solving large scale linear prediction problems using stochastic gradient descent algorithms, in: Proceedings of the 21st International Conference on Machine Learning (ICML’04), 2004.

[424]

T. Zhang, Data dependent concentration bounds for sequential prediction algorithms, in: 18th Annual Conference on Learning Theory (COLT’05), 2005, pp. 173–187.

[425]

W. Zhang, P. Zhao, W. Zhu, S.C. Hoi, T. Zhang, Projection-free distributed online learning in networks, International Conference on Machine Learning (2017) 4054–4062.

[426]

X. Zhang, T. Yang, P. Srinivasan, Online asymmetric active learning with imbalanced data, in: KDD, 2016.

[427]

P. Zhao, Kernel based online learning. Ph.D. thesis. Nanyang Technological University, 2013.

[428]

P. Zhao, S.C. Hoi, Otl: a framework of online transfer learning, in: Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 1231–1238.

[429]

P. Zhao, S.C. Hoi, Bduol: double updating online learning on a fixed budget, Machine Learning and Knowledge Discovery in Databases (2012) 810–826.

[430]

P. Zhao, S.C. Hoi, Cost-sensitive online active learning with application to malicious url detection, in: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2013, pp. 919–927.

[431]

P. Zhao, S.C. Hoi, R. Jin, Duol: A double updating approach for online learning, Advances in Neural Information Processing Systems (2009) 2259–2267.

[432]

P. Zhao, S.C. Hoi, R. Jin, Double updating online learning, J. Mach. Learn. Res. 12 (2011) 1587–1615.

[433]

P. Zhao, S.C. Hoi, J. Wang, B. Li, Online transfer learning, Artif. Intell. 216 (2014) 76–102.

[434]

P. Zhao, S.C.H. Hoi, J. Zhuang, Active learning with expert advice, in: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence, 2013.

[435]

P. Zhao, R. Jin, T. Yang, S.C. Hoi, Online auc maximization, in: Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011b, pp. 233–240.

[436]

P. Zhao, J. Wang, P. Wu, R. Jin, S.C.H. Hoi, Fast bounded online gradient descent algorithms for scalable kernel-based online learning, in: ICML, 2012.

[437]

P. Zhao, Y. Zhang, M. Wu, S.C. Hoi, M. Tan, J. Huang, Adaptive cost-sensitive online classification, IEEE Trans. Knowl. Data Eng. 31 (2018) 214–228.

[438]

P. Zhao, F. Zhuang, M. Wu, X.L. Li, S.C. Hoi, Cost-sensitive online classification with adaptive regularization and its applications, in: Data Mining (ICDM), 2015 IEEE International Conference on, IEEE, 2015, pp. 649–658.

[439]

Y. Zheng, J. Jestes, J.M. Phillips, F. Li, Quality and efficiency for kernel density estimates in large data, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013, pp. 433–444.

[440]

A. Zhou, Z. Cai, L. Wei, W. Qian, M-kernel merging: Towards density estimation over data streams, in: DASFAA, IEEE, 2003, pp. 285–292.

[441]

L. Zhou, A survey on contextual multi-armed bandits, CoRR (2015) abs/1508.03326.

[442]

X. Zhu, Semi-supervised learning literature survey, Computer Science, University of Wisconsin-Madison 2 (2006) 4.

[443]

X. Zhu, Z. Ghahramani, J.D. Lafferty, Semi-supervised learning using gaussian fields and harmonic functions, in: Proceedings of the 20th International conference on Machine learning (ICML-03), 2003, pp. 912–919.

[444]

X. Zhu, A.B. Goldberg, Introduction to semi-supervised learning, Synthesis lectures on artificial intelligence and machine learning 3 (2009) 1–130.

[445]

M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in: Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), 2003, pp. 928–936.

[446]

M. Zoghi, T. Tunys, M. Ghavamzadeh, B. Kveton, C. Szepesvari, Z. Wen, Online learning to rank in stochastic click models, International Conference on Machine Learning (2017) 4199–4208.

Index Terms

Online learning: A comprehensive survey
1. Computing methodologies
  1. Machine learning
2. Theory of computation
  1. Design and analysis of algorithms
  2. Theory and algorithms for application domains
    1. Machine learning theory

Index terms have been assigned to the content through auto-classification.

Published In

Neurocomputing Volume 459, Issue C

Oct 2021

495 pages

ISSN:0925-2312

Elsevier B.V.

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 12 October 2021

Qualifiers

Research-article
