Abstract
Clothing classification serves as a fundamental task for clothing retrieval, clothing recommendation, etc. In this task, there are two inherent challenges: suppressing complex backgrounds outside the clothing region and disentangling the feature entanglement of shape-similar clothing samples. These challenges arise from insufficient attention to key distinctions of clothing, which hinders the accuracy of clothing classification. Also, the high computational resource requirement of some complex and large-scale models also decreases the inference efficiency. To tackle these challenges, we propose a new COntext-driven Clothing ClassIfication network (COCCI), which improves inference accuracy while reducing model complexity. First, we design a self-adaptive attention fusion (SAAF) module to enhance category-exclusive clothing features and prevent misclassification by suppressing ineffective features with confused image contexts. Second, we propose a novel multi-scale feature aggregation (MSFA) module to establish spatial context correlations by using multi-scale clothing features. This helps disentangle feature entanglement among shape-similar clothing samples. Finally, we introduce knowledge distillation to extract reliable teacher knowledge from complex datasets, which helps student models learn clothing features with rich representation information, thereby improving generalization while reducing model complexity. In comparison to state-of-the-art networks trained with one single model, our method achieves SOTA performance on the widely-used clothing classification benchmark.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Dosovitskiy, A., et al.: An image is worth 16 \(\times \) 16 words: transformers for image recognition at scale. In: ICLR, pp. 1–12 (2021)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Lan, S., Li, J., Hu, S., Fan, H., Pan, Z.: A neighbourhood feature-based local binary pattern for texture classification. Vis. Comput. 1–25 (2023)
Liu, Y., Dou, Y., Jin, R., Li, R., Qiao, P.: Hierarchical learning with backtracking algorithm based on the visual confusion label tree for large-scale image classification. Vis. Comput. 38(3), 897–917 (2022)
Liu, Z., et al.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 10012–10022, October 2021
Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11976–11986 (2022)
Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: DeepFashion: powering robust clothes recognition and retrieval with rich annotations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1096–1104 (2016)
Shajini, M., Ramanan, A.: A knowledge-sharing semi-supervised approach for fashion clothes classification and attribute prediction. Vis. Comput. 38(11), 3551–3561 (2022)
Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., Hu, Q.: ECA-Net: efficient channel attention for deep convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11534–11542 (2020)
Wang, W., Xu, Y., Shen, J., Zhu, S.C.: Attentive fashion grammar network for fashion landmark detection and clothing category classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4271–4280 (2018)
Woo, S., et al.: ConvNeXt V2: co-designing and scaling ConvNets with masked autoencoders. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 16133–16142, June 2023
Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_1
Xia, T.E., Zhang, J.Y.: Clothing classification using transfer learning with squeeze and excitation block. Multimedia Tools Appl. 82(2), 2839–2856 (2023)
Xiao, T., Xia, T., Yang, Y., Huang, C., Wang, X.: Learning from massive noisy labeled data for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2691–2699 (2015)
Xu, J., Wei, Y., Wang, A., Zhao, H., Lefloch, D.: Analysis of clothing image classification models: a comparison study between traditional machine learning and deep learning models. Fibres Text. East. Eur. 30(5), 66–78 (2022)
Yu, F., et al.: EnCaps: clothing image classification based on enhanced capsule network. Appl. Sci. 11(22), 11024 (2021)
Zeghoud, S., et al.: Real-time spatial normalization for dynamic gesture classification. Vis. Comput. 1–13 (2022)
Zhang, Y., Zhang, P., Yuan, C., Wang, Z.: Texture and shape biased two-stream networks for clothing classification and attribute recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13538–13547 (2020)
Zhou, Z., Liu, M., Deng, W., Wang, Y., Zhu, Z.: Clothing image classification with DenseNet201 network and optimized regularized random vector functional link. J. Nat. Fibers 20(1), 2190188 (2023)
Acknowledgements
This work was supported by the national natural science foundation of China (No. 62202346), Hubei key research and development program (No.2021BAA042), open project of engineering research center of Hubei province for clothing information (No. 2022HBCI01), Wuhan applied basic frontier research project (No. 2022013988065212), MIIT’s AI Industry Innovation Task unveils flagship projects (Key technologies, equipment, and systems for flexible customized and intelligent manufacturing in the clothing industry), and Hubei science and technology project of safe production special fund (Scene control platform based on proprioception information computing of artificial intelligence).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, M. et al. (2024). COCCI: Context-Driven Clothing Classification Network. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14495. Springer, Cham. https://doi.org/10.1007/978-3-031-50069-5_7
Download citation
DOI: https://doi.org/10.1007/978-3-031-50069-5_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50068-8
Online ISBN: 978-3-031-50069-5
eBook Packages: Computer ScienceComputer Science (R0)