
Deep Learning-Based Image Retrieval With Unsupervised Double Bit Hashing

Published: 18 April 2023

Abstract

Unsupervised image hashing is a widely used technique for large-scale image retrieval. It maps an image to a finite-length binary code without extensive human-annotated data, enabling compact storage and effective semantic retrieval. This study proposes a novel deep unsupervised double-bit hashing method for image retrieval. The approach builds on double-bit hashing, which has been shown to preserve the neighborhood structure of binary codes better than single-bit hashing. Traditional double-bit hashing methods must process the entire dataset at once to determine the optimal thresholds for binary feature encoding. In contrast, the proposed method trains the hashing layer in a minibatch manner, learning the thresholds adaptively through a gradient-based optimization strategy. Additionally, unlike most previous methods, which train only the hashing network on top of a fixed pre-trained backbone, the proposed learning framework trains the hashing and backbone networks alternately and asynchronously. This strategy enables the model to maximize the learning capability of both the hashing and backbone networks. Furthermore, adopting a lightweight Vision Transformer (ViT) allows the model to capture both local and global relationships among multiple views of an image, which leads to better generalization and thus maximizes retrieval performance. Extensive experiments on the CIFAR10, NUS-WIDE, and FLICKR25K datasets validate that the proposed method achieves superior retrieval quality and computational efficiency compared with state-of-the-art methods.
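To make the adaptive-threshold idea concrete, the following is a minimal PyTorch sketch of a double-bit hashing layer whose two per-dimension thresholds are learned by minibatch gradient descent. The class name DoubleBitHashLayer, the sigmoid relaxation, and the softplus parameterization of the threshold gap are illustrative assumptions rather than the authors' implementation; the 2-bit encoding itself follows the classical double-bit quantization scheme the paper builds on.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoubleBitHashLayer(nn.Module):
    """Illustrative double-bit hashing layer with learnable per-dimension
    thresholds (hypothetical formulation, not the paper's exact one).

    Each feature value x is encoded with two thresholds a < b:
        x <  a        -> bits (0, 1)
        a <= x <= b   -> bits (0, 0)
        x >  b        -> bits (1, 0)
    so adjacent regions differ by one bit and the two extreme regions by two.
    The hard comparisons are relaxed with sigmoids during training so the
    thresholds receive gradients from a minibatch objective.
    """

    def __init__(self, dim: int, temperature: float = 10.0):
        super().__init__()
        self.lower = nn.Parameter(torch.full((dim,), -0.5))  # a, one per dimension
        self.gap = nn.Parameter(torch.ones(dim))              # b = a + softplus(gap) > a
        self.temperature = temperature

    def forward(self, x: torch.Tensor, hard: bool = False) -> torch.Tensor:
        upper = self.lower + F.softplus(self.gap)
        if hard:                                   # discrete codes for retrieval
            above = (x > upper).float()
            below = (x < self.lower).float()
        else:                                      # differentiable surrogate for training
            above = torch.sigmoid(self.temperature * (x - upper))
            below = torch.sigmoid(self.temperature * (self.lower - x))
        # Interleave the two bits of every dimension -> code of length 2 * dim.
        return torch.stack([above, below], dim=-1).flatten(start_dim=-2)


if __name__ == "__main__":
    layer = DoubleBitHashLayer(dim=32)
    feats = torch.randn(8, 32)          # stand-in for a minibatch of backbone features
    relaxed = layer(feats)              # real-valued codes used during training
    binary = layer(feats, hard=True)    # {0, 1} codes used for Hamming-space retrieval
    print(relaxed.shape, binary.shape)  # torch.Size([8, 64]) torch.Size([8, 64])
```

During training, the relaxed codes can be plugged into whatever unsupervised objective the framework uses; at retrieval time, the hard codes are stored and compared by Hamming distance.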


Cited By

  • (2024) "Disperse Asymmetric Subspace Relation Hashing for Cross-Modal Retrieval," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 1, pp. 603–617, Jan. 2024. 10.1109/TCSVT.2023.3287301.


Published In

IEEE Transactions on Circuits and Systems for Video Technology, Volume 33, Issue 11, Nov. 2023, 880 pages

Publisher

IEEE Press
