Article
DOI: 10.1007/978-3-030-92238-2_21

Alleviating Catastrophic Interference in Online Learning via Varying Scale of Backward Queried Data

Published: 08 December 2021

Abstract

In recent years, connectionist networks have become a staple in real world systems due to their ability to generalize and find intricate relationships and patterns in data. One inherent limitation to connectionist networks, however, is catastrophic interference, an inclination to lose retention of previously formed knowledge when training with new data. This hindrance has been especially evident in online machine learning, where data is fed sequentially into the connectionist network. Previous methods, such as rehearsal and pseudo-rehearsal systems, have attempted to alleviate catastrophic interference by introducing past data or replicated data into the data stream. While these methods have proven to be effective, they add additional complexity to the model and require the saving of previous data.
In this paper, we propose a comprehensive array of low-cost online approaches to alleviating catastrophic interference by incorporating three different scales of backward queried data into the online optimization algorithm; more specifically, we average the gradient signal of the optimization algorithm with that of the backward queried data for the classes in the data set. Testing our method against online stochastic gradient descent as the benchmark, we observe improved performance of the neural network, achieving approximately seven percent higher accuracy with significantly less variance.
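The gradient-averaging step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a toy softmax classifier, and the `backward_query` helper here is a hypothetical stand-in that synthesizes a class exemplar by gradient ascent on the class score; all names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny linear softmax classifier: W maps features -> class logits.
n_features, n_classes = 4, 3

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def grad(W, x, y):
    """Cross-entropy gradient w.r.t. W for one example (x, y)."""
    p = softmax(x @ W)
    p[y] -= 1.0
    return np.outer(x, p)

def backward_query(W, y, steps=50, eta=0.5):
    """Hypothetical backward query: synthesize an input the current
    model associates with class y, via gradient ascent on log p(y|x)."""
    x = rng.normal(scale=0.1, size=n_features)
    for _ in range(steps):
        p = softmax(x @ W)
        x += eta * (W[:, y] - W @ p)  # d log p(y|x) / dx
    return x

def online_step(W, x, y, lr=0.1):
    """One online update: average the incoming sample's gradient with
    gradients on one backward-queried exemplar per known class."""
    g = grad(W, x, y)
    queried = [grad(W, backward_query(W, c), c) for c in range(n_classes)]
    g_avg = (g + sum(queried)) / (1 + n_classes)
    return W - lr * g_avg
```

Because the real sample's gradient is averaged with self-consistency gradients on exemplars for every class, each update nudges the model toward the new data while pulling it back toward its current beliefs about the other classes, which is the intuition behind mitigating interference without storing past data.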



Published In

Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8–12, 2021, Proceedings, Part III
Dec 2021
723 pages
ISBN:978-3-030-92237-5
DOI:10.1007/978-3-030-92238-2

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Catastrophic interference
  2. Online learning
  3. Backward query
  4. Stochastic gradient descent

