Abstract
Multi-label image classification aims to predict all labels present in an image. It is usually formulated as a partial-label learning problem, because annotating every label in every training image can be expensive in practice. Existing works on partial-label learning focus on the case where each training image is annotated with only a subset of its labels. A special case is to annotate only one positive label in each training image. To further relieve the annotation burden and enhance the performance of the classifier, this paper proposes a new partial-label setting in which only a subset of the training images is labeled, each with only one positive label, while the rest of the training images remain unlabeled. To handle this new setting, we propose an end-to-end deep network, PLMCL (Partial-Label Momentum Curriculum Learning), that learns to produce confident pseudo labels for both partially labeled and unlabeled training images. A novel momentum-based law updates the soft pseudo labels on each training image while taking into account the updating velocity of the pseudo labels, which helps avoid getting trapped in low-confidence local minima, especially in the early stage of training, when both observed labels and confidence in the pseudo labels are lacking. In addition, we present a confidence-aware scheduler that adaptively performs easy-to-hard learning for different labels. Extensive experiments demonstrate that the proposed PLMCL outperforms many state-of-the-art multi-label classification methods under various partial-label settings on three different datasets.
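As a rough illustration of what a momentum-based update of soft pseudo labels could look like, the sketch below keeps a per-label velocity and moves each pseudo label toward the current network prediction. It is not the update law from the paper, whose exact form the abstract does not give. The function name `momentum_update`, the hyperparameters `beta` and `step`, and the choice to pin observed positive labels at 1.0 are all assumptions made for this example.

```python
def momentum_update(pseudo, velocity, preds, observed, beta=0.9, step=0.5):
    """One illustrative momentum step on per-label soft pseudo labels.

    pseudo, velocity, preds: lists of floats, one entry per label.
    observed: set of label indices annotated as positive (assumed fixed at 1.0).
    beta, step: hypothetical momentum and step-size hyperparameters.
    """
    new_pseudo, new_velocity = [], []
    for k, (p, v, f) in enumerate(zip(pseudo, velocity, preds)):
        if k in observed:
            # Observed positives are kept as hard labels in this sketch.
            new_pseudo.append(1.0)
            new_velocity.append(0.0)
            continue
        # Velocity is an exponential moving average of the gap between
        # the network prediction and the current pseudo label.
        v = beta * v + (1.0 - beta) * (f - p)
        # Move the pseudo label along the velocity and clamp to [0, 1].
        p = min(1.0, max(0.0, p + step * v))
        new_pseudo.append(p)
        new_velocity.append(v)
    return new_pseudo, new_velocity


# Toy example: label 0 is the single observed positive, labels 1 and 2 are unknown.
pseudo = [1.0, 0.5, 0.5]
velocity = [0.0, 0.0, 0.0]
preds = [0.9, 0.8, 0.1]  # made-up network outputs, held fixed for simplicity
for _ in range(3):
    pseudo, velocity = momentum_update(pseudo, velocity, preds, observed={0})
print(pseudo)
```

In this toy run the unobserved pseudo labels drift toward the predictions gradually rather than jumping to them, which is the general intuition behind using a velocity term when early predictions are unreliable.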
Acknowledgements
The authors gratefully acknowledge the partial financial support of the National Science Foundation (1830512).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Abdelfattah, R., Zhang, X., Wu, Z., Wu, X., Wang, X., Wang, S. (2023). PLMCL: Partial-Label Momentum Curriculum Learning for Multi-label Image Classification. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_3
DOI: https://doi.org/10.1007/978-3-031-25063-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25062-0
Online ISBN: 978-3-031-25063-7
eBook Packages: Computer Science, Computer Science (R0)