
Deep Learning-Based Image Retrieval With Unsupervised Double Bit Hashing

Published: 18 April 2023

Abstract

Unsupervised image hashing is a widely used technique for large-scale image retrieval. It maps an image to a finite-length binary code without extensive human-annotated data, enabling compact storage and effective semantic retrieval. This study proposes a novel deep unsupervised double-bit hashing method for image retrieval. The approach builds on double-bit hashing, which has been shown to preserve the neighborhood structure of binary codes better than single-bit hashing. Traditional double-bit hashing methods must process the entire dataset at once to determine the optimal thresholds for binary feature encoding. In contrast, the proposed method trains the hashing layer in a minibatch manner, learning the thresholds adaptively through a gradient-based optimization strategy. Additionally, unlike most previous methods, which train only the hashing network on top of a fixed pre-trained backbone, the proposed learning framework trains the hashing and backbone networks alternately and asynchronously. This strategy enables the model to maximize the learning capability of both the hashing and backbone networks. Furthermore, adopting a lightweight Vision Transformer (ViT) allows the model to capture both local and global relationships among multiple views of an image, which leads to better generalization and thus maximizes retrieval performance. Extensive experiments on the CIFAR10, NUS-WIDE, and FLICKR25K datasets validate that the proposed method achieves superior retrieval quality and computational efficiency compared with state-of-the-art methods.
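To make the adaptive-threshold idea concrete, the following is a minimal PyTorch sketch of a double-bit hashing layer whose two per-dimension thresholds are learned by minibatch gradient descent. The class name DoubleBitHashLayer, the sigmoid relaxation, and the softplus parameterization of the threshold gap are illustrative assumptions rather than the authors' implementation; the 2-bit encoding itself follows the classical double-bit quantization scheme the paper builds on.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoubleBitHashLayer(nn.Module):
    """Illustrative double-bit hashing layer with learnable per-dimension
    thresholds (hypothetical formulation, not the paper's exact one).

    Each feature value x is encoded with two thresholds a < b:
        x <  a        -> bits (0, 1)
        a <= x <= b   -> bits (0, 0)
        x >  b        -> bits (1, 0)
    so adjacent regions differ by one bit and the two extreme regions by two.
    The hard comparisons are relaxed with sigmoids during training so the
    thresholds receive gradients from a minibatch objective.
    """

    def __init__(self, dim: int, temperature: float = 10.0):
        super().__init__()
        self.lower = nn.Parameter(torch.full((dim,), -0.5))  # a, one per dimension
        self.gap = nn.Parameter(torch.ones(dim))              # b = a + softplus(gap) > a
        self.temperature = temperature

    def forward(self, x: torch.Tensor, hard: bool = False) -> torch.Tensor:
        upper = self.lower + F.softplus(self.gap)
        if hard:                                   # discrete codes for retrieval
            above = (x > upper).float()
            below = (x < self.lower).float()
        else:                                      # differentiable surrogate for training
            above = torch.sigmoid(self.temperature * (x - upper))
            below = torch.sigmoid(self.temperature * (self.lower - x))
        # Interleave the two bits of every dimension -> code of length 2 * dim.
        return torch.stack([above, below], dim=-1).flatten(start_dim=-2)


if __name__ == "__main__":
    layer = DoubleBitHashLayer(dim=32)
    feats = torch.randn(8, 32)          # stand-in for a minibatch of backbone features
    relaxed = layer(feats)              # real-valued codes used during training
    binary = layer(feats, hard=True)    # {0, 1} codes used for Hamming-space retrieval
    print(relaxed.shape, binary.shape)  # torch.Size([8, 64]) torch.Size([8, 64])
```

During training, the relaxed codes can be plugged into whatever unsupervised objective the framework uses; at retrieval time, the hard codes are stored and compared by Hamming distance.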


Cited By

  • (2024) "Disperse Asymmetric Subspace Relation Hashing for Cross-Modal Retrieval," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 1, pp. 603–617, Jan. 2024. 10.1109/TCSVT.2023.3287301.


Published In

IEEE Transactions on Circuits and Systems for Video Technology, Volume 33, Issue 11, Nov. 2023, 880 pages

Publisher

IEEE Press
