Abstract
Zero-shot sketch-based image retrieval (ZS-SBIR) has attracted increasing attention in computer vision. Its aim is to retrieve the natural images that match a given sketch under the zero-shot learning setting. In this paper, we propose a novel ZS-SBIR method based on a purpose-built domain-aware dual attention (DADA) module and a similarity loss, which together utilize prior knowledge more fully than previous work. The DADA module weights channels and spatial locations differently according to the prior knowledge of whether the input is a sketch or a natural image. To preserve more prior knowledge from models pre-trained on ImageNet, a similarity loss function is designed to supervise inter-class similarity. Furthermore, a center loss is introduced into our model to bridge the representation gap between the sketch domain and the natural image domain. We perform experiments on the TU-Berlin extended dataset and the Sketchy extended dataset, and the results verify the effectiveness of the proposed method.
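To make the role of the center loss concrete, the following is a minimal sketch of the standard center loss from the face recognition literature, applied to a batch that mixes sketch and image embeddings. It is an illustration under stated assumptions, not the formulation used in this paper: the function name `center_loss`, the toy embeddings, the use of one shared center per class for both domains, and the averaging over the batch are all hypothetical choices made here.

```python
import numpy as np


def center_loss(features, labels, centers):
    """Half the mean squared distance between each feature and its class center.

    Assumption: averaging over the batch; other normalizations are possible.
    """
    diffs = features - centers[labels]  # shape (N, D)
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))


# Hypothetical toy batch: two sketch embeddings and two image embeddings
# covering classes 0 and 1.
sketch_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
image_feats = np.array([[1.0, 2.0], [2.0, 1.0]])
labels = np.array([0, 1])

# Assumption: one center per class, shared by both domains, so that sketches
# and images of the same class are pulled toward the same point.
centers = np.array([[1.0, 1.0], [1.0, 1.0]])

features = np.concatenate([sketch_feats, image_feats])
batch_labels = np.concatenate([labels, labels])
loss = center_loss(features, batch_labels, centers)
print(loss)
```

Because both domains share the same class centers in this sketch, minimizing the loss draws sketch and image features of one class together, which is one way such a term can narrow the gap between the two domains.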
Data availability
We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work. There is no professional or other personal interest of any nature or kind in any product, service or company that could be construed as influencing the position presented in, or the review of, this manuscript. The data that support the findings of this study are openly available.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62202003), the Key Scientific Research Project of Anhui Province (2022AH050093), the Science and Technology Major Project of Anhui Province (202203a05020027) and the Major University Science Research Project of Anhui Province (KJ2021ZD0004).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zhu, M., Zhao, C., Wang, N. et al. Domain-aware double attention network for zero-shot sketch-based image retrieval with similarity loss. Vis Comput 40, 3091–3101 (2024). https://doi.org/10.1007/s00371-023-03012-8
DOI: https://doi.org/10.1007/s00371-023-03012-8