Abstract
Cross-modal hashing has attracted increasing research interest for its high speed and low storage cost in solving the cross-modal approximate nearest neighbor search problem. With the rapid growth of social networks, a large amount of data is generated every day, and it inevitably contains some noisy labels. However, most existing cross-modal hashing methods do not take label noise into consideration and simply assume that all training data are completely reliable. Consequently, their effectiveness degrades on real-world data affected by label noise. In this paper, we propose a novel end-to-end cross-modal hashing method, called label noise robust cross-modal hashing (LNRCMH), to solve the cross-modal hashing problem on data with label noise. LNRCMH first computes the local outlier factor (LOF) of each instance to estimate the probability that the instance is corrupted by label noise. It then assigns lower weights to instances with a high probability of corruption. Finally, LNRCMH uses modality-specific neural networks to learn features for instances from different modalities and transforms them into binary hash codes. Experimental results on multi-modal benchmark datasets demonstrate that LNRCMH performs significantly better than other related and comparable methods under noisy label annotations, and it also achieves competitive results in noise-free settings.
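The abstract does not give the exact weighting formula, but the LOF-based down-weighting step can be illustrated with a short sketch. The example below uses scikit-learn's `LocalOutlierFactor` to score each training instance and a hypothetical exponential mapping from LOF scores to instance weights (the mapping, the synthetic data, and the variable names are assumptions for illustration, not the paper's actual method):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Synthetic training features: 95 "clean" instances clustered near the
# origin, plus 5 far-away instances standing in for label-noise suspects
# whose features disagree with their labeled neighborhood.
inliers = rng.normal(0.0, 0.5, size=(95, 2))
suspects = rng.normal(6.0, 0.5, size=(5, 2))
X = np.vstack([inliers, suspects])

# LOF is roughly 1 for instances inside a dense neighborhood and
# grows for instances that are much sparser than their neighbors.
lof = LocalOutlierFactor(n_neighbors=20)
lof.fit(X)
scores = -lof.negative_outlier_factor_  # sklearn stores negated LOF values

# Hypothetical weighting scheme: instances with LOF near 1 keep a weight
# close to 1; instances with large LOF are exponentially down-weighted,
# so they contribute less to the subsequent hash-learning loss.
weights = np.exp(-np.clip(scores - 1.0, 0.0, None))

print("mean clean weight:", weights[:95].mean())
print("mean suspect weight:", weights[95:].mean())
```

In a full pipeline these per-instance weights would multiply each instance's term in the hashing loss, so suspected noisy instances influence the learned binary codes less than trusted ones.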
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, R., Yang, Y., Han, G. (2021). A Label Noise Robust Cross-Modal Hashing Approach. In: Qiu, H., Zhang, C., Fei, Z., Qiu, M., Kung, S.Y. (eds) Knowledge Science, Engineering and Management. KSEM 2021. Lecture Notes in Computer Science, vol 12816. Springer, Cham. https://doi.org/10.1007/978-3-030-82147-0_47
Print ISBN: 978-3-030-82146-3
Online ISBN: 978-3-030-82147-0