Abstract
Semantic hashing is a technique for representing high-dimensional data with similarity-preserving binary codes, enabling efficient indexing and search. Variational autoencoders with Bernoulli latent representations have recently achieved remarkable success in learning such codes in both supervised and unsupervised scenarios, outperforming traditional methods thanks to their ability to handle the binary constraints architecturally.
In this paper, we propose a novel self-supervised method for training variational autoencoders, in which the model uses its own predictions of the label distribution to implement the pairwise objective function. We also investigate how robust hashing methods based on variational autoencoders are to a lack of supervision, focusing on two semi-supervised approaches currently in use. Our experiments on text and image retrieval tasks show that, as expected, both approaches significantly improve the quality of the hash codes as the number of labelled observations increases, but their performance deteriorates as the number of labelled samples decreases. In this low-label scenario, the proposed self-supervised approach outperforms the classical approaches, while yielding similar performance in fully supervised settings.
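To make the retrieval setting concrete, the following is a minimal sketch of how binary hash codes can be searched by Hamming distance. It is not taken from the paper or its repository: the function names `hamming_distance` and `hamming_search` and the toy 8-bit codes are assumptions made purely for illustration, and the codes are written by hand rather than produced by a trained autoencoder.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    # XOR leaves a 1 exactly where the codes disagree; count those 1s.
    return bin(a ^ b).count("1")


def hamming_search(query: int, codes: List[int], k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the k codes closest to the query."""
    ranked = sorted(
        ((i, hamming_distance(query, c)) for i, c in enumerate(codes)),
        key=lambda pair: pair[1],  # smaller distance = more similar
    )
    return ranked[:k]


# Hypothetical 8-bit codes, hand-written for illustration only.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110110

top2 = hamming_search(query, database, k=2)
print(top2)  # [(0, 1), (1, 2)]
```

This linear scan is only meant to show why binary codes are cheap to compare (one XOR and a bit count per item); large collections would typically use a dedicated index instead.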
Notes
1. A high value in this function means that the objects are more similar.
2. Our code is made publicly available at https://github.com/amacaluso/SSB-VAE.
3. A draft version of this work is available on arXiv:2007.08799.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Ñanculef, R., Mena, F., Macaluso, A., Lodi, S., Sartori, C. (2021). Self-supervised Bernoulli Autoencoders for Semi-supervised Hashing. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_25
DOI: https://doi.org/10.1007/978-3-030-93420-0_25
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93419-4
Online ISBN: 978-3-030-93420-0
eBook Packages: Computer Science (R0)