
Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions

  • Conference paper
  • In: Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13680)

Abstract

The paradigm of machine intelligence is moving from purely supervised learning to a more practical scenario in which abundant loosely related unlabeled data are available and labeled data are scarce. Most existing algorithms assume that the underlying task distribution is stationary. Here we consider a more realistic and challenging setting in which task distributions evolve over time. We name this problem Semi-supervised meta-learning with Evolving Task diStributions, abbreviated as SETS. Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data, and (ii) how to prevent catastrophic forgetting of previously learned task distributions when the task distribution shifts. We propose an OOD Robust and knowleDge presErved semi-supeRvised meta-learning approach (ORDER), so named because the task distributions arrive sequentially in some order, to tackle these two major challenges. Specifically, ORDER introduces a novel mutual information regularization to robustify the model with unlabeled OOD data and adopts an optimal transport regularization to retain previously learned knowledge in feature space. In addition, we test our method on a very challenging benchmark: SETS on large-scale non-stationary semi-supervised task distributions consisting of (at least) 72K tasks. Extensive experiments demonstrate that ORDER alleviates forgetting on evolving task distributions and is more robust to OOD data than strong related baselines.
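
Since the abstract only names the two regularizers, the sketch below is a minimal, hypothetical illustration of how such an objective could be wired up. Everything in it is an assumption: the function names (`sinkhorn_divergence`, `infonce_mi_lower_bound`, `order_style_loss`), the weights `mi_weight` and `ot_weight`, and the choice of an InfoNCE estimator for the mutual information term and plain Sinkhorn iterations for the optimal transport term are stand-ins, not the authors' actual formulation.

```python
import torch
import torch.nn.functional as F


def sinkhorn_divergence(x, y, epsilon=0.1, n_iters=50):
    """Entropy-regularized OT cost between two feature batches via Sinkhorn iterations."""
    cost = torch.cdist(x, y, p=2) ** 2          # pairwise squared-Euclidean costs
    a = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)  # uniform marginal
    b = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)  # uniform marginal
    K = torch.exp(-cost / epsilon)              # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                    # Sinkhorn fixed-point updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan diag(u) K diag(v)
    return (plan * cost).sum()


def infonce_mi_lower_bound(z1, z2, temperature=0.1):
    """InfoNCE lower bound on MI between paired embeddings (a stand-in MI estimator)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # similarity of every pair in the batch
    labels = torch.arange(z1.size(0), device=z1.device)  # matching pairs are positives
    return -F.cross_entropy(logits, labels)     # higher value = more mutual information


def order_style_loss(meta_loss, z_unlabeled_a, z_unlabeled_b,
                     z_current, z_memory, mi_weight=0.1, ot_weight=0.1):
    """Meta-learning loss plus an MI robustness term and an OT remembering term."""
    mi_term = infonce_mi_lower_bound(z_unlabeled_a, z_unlabeled_b)
    ot_term = sinkhorn_divergence(z_current, z_memory)
    # Subtracting the MI lower bound encourages informative features on unlabeled
    # data; the OT term keeps current features close to stored memory features.
    return meta_loss - mi_weight * mi_term + ot_weight * ot_term
```

In this reading, the MI term rewards embeddings of unlabeled data that stay predictive across paired views (so uninformative OOD features contribute little), while the OT term anchors the current feature distribution to features stored from earlier task distributions, matching the abstract's "remember previously learned knowledge in feature space".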



Acknowledgement

We thank all the anonymous reviewers for their thoughtful and insightful comments. This research was supported in part by NSF through grant IIS-1910492.

Author information

Correspondence to Li Shen or Mingchen Gao.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 464 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, Z. et al. (2022). Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_13

  • DOI: https://doi.org/10.1007/978-3-031-20044-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20043-4

  • Online ISBN: 978-3-031-20044-1

  • eBook Packages: Computer Science, Computer Science (R0)
