Abstract
The paradigm of machine intelligence is moving from purely supervised learning to a more practical scenario in which abundant loosely related unlabeled data are available and labeled data are scarce. Most existing algorithms assume that the underlying task distribution is stationary. Here we consider a more realistic and challenging setting in which task distributions evolve over time. We name this problem Semi-supervised meta-learning with Evolving Task diStributions, abbreviated SETS. Two key challenges arise in this setting: (i) how to exploit unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data, and (ii) how to prevent catastrophic forgetting of previously learned task distributions caused by task distribution shift. We propose an OOD Robust and knowleDge presErved semi-supeRvised meta-learning approach (ORDER) to tackle these two challenges; the name reflects that the task distributions arrive sequentially in some order. Specifically, ORDER introduces a novel mutual information regularization to robustify the model against unlabeled OOD data and adopts an optimal transport regularization to retain previously learned knowledge in feature space. In addition, we evaluate our method on a very challenging benchmark: SETS on large-scale non-stationary semi-supervised task distributions consisting of (at least) 72K tasks. Through extensive experiments, we demonstrate that ORDER alleviates forgetting on evolving task distributions and is more robust to OOD data than strong related baselines.
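The abstract describes ORDER's two regularizers only at a high level, and the exact objective is not reproduced on this page. As a rough, hedged illustration of how such a combined loss might look, the sketch below pairs a common entropy-based surrogate for the mutual information term (an assumption, not necessarily the paper's formulation) with an entropic optimal transport (Sinkhorn) penalty between current features and features buffered from earlier task distributions. All names here (`sinkhorn_ot`, `order_style_loss`, `memory_feats`, `lam_mi`, `lam_ot`) are hypothetical.

```python
# Illustrative sketch only: not the paper's actual implementation of ORDER.
# Assumptions: the MI term is approximated by a standard entropy surrogate
# (maximize marginal entropy, minimize conditional entropy), and the
# knowledge-preservation term is an entropic OT cost to replayed features.
import torch
import torch.nn.functional as F

def sinkhorn_ot(x, y, eps=0.1, iters=50):
    """Entropic optimal transport cost between two feature batches
    with uniform sample weights, via Sinkhorn fixed-point iterations."""
    cost = torch.cdist(x, y, p=2) ** 2                 # pairwise squared distances
    K = torch.exp(-cost / eps)                         # Gibbs kernel
    a = torch.full((x.size(0),), 1.0 / x.size(0), device=x.device)
    b = torch.full((y.size(0),), 1.0 / y.size(0), device=y.device)
    u = torch.ones_like(a)
    for _ in range(iters):                             # alternating scaling updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                    # transport plan
    return (P * cost).sum()

def order_style_loss(logits_s, labels_s, logits_u, feats, memory_feats,
                     lam_mi=0.1, lam_ot=0.1):
    """Supervised loss + MI surrogate on unlabeled data + OT forgetting penalty."""
    ce = F.cross_entropy(logits_s, labels_s)
    p = logits_u.softmax(dim=-1)
    # MI surrogate I(x; y) ~= H(marginal) - H(conditional): encourage confident
    # per-sample predictions with a diverse batch-level marginal; unlabeled OOD
    # samples that remain uncertain contribute little signal.
    cond_ent = -(p * p.clamp_min(1e-8).log()).sum(-1).mean()
    marg = p.mean(0)
    marg_ent = -(marg * marg.clamp_min(1e-8).log()).sum()
    mi_term = cond_ent - marg_ent                      # minimizing this maximizes the surrogate
    ot_term = sinkhorn_ot(feats, memory_feats)         # stay close to earlier features
    return ce + lam_mi * mi_term + lam_ot * ot_term
```

In training, `feats` would be the current model's embeddings of a mixed labeled/unlabeled batch and `memory_feats` embeddings replayed from a small buffer of earlier tasks; an OT penalty of this kind pulls the current feature distribution toward the remembered one without requiring class-aligned pairs.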
Acknowledgement
We thank all the anonymous reviewers for their thoughtful and insightful comments. This research was supported in part by NSF through grant IIS-1910492.