DOI: 10.1145/3331184.3331217

Online Multi-modal Hashing with Dynamic Query-adaption

Published: 18 July 2019

Abstract

Multi-modal hashing is an effective technique for large-scale multimedia retrieval because it encodes heterogeneous multi-modal features into compact, similarity-preserving binary codes. Although great progress has been made, existing methods still suffer from several problems: 1) They adopt fixed modality combination weights in the online hashing process to generate the query hash codes, a strategy that cannot adapt to the variations among different queries. 2) They either capture insufficient semantics (unsupervised methods) or incur high computation and storage costs (supervised methods, which rely on a pair-wise semantic matrix). 3) They solve for the hash codes with either a relaxed optimization strategy, which causes significant quantization loss, or bit-by-bit discrete optimization, which consumes considerable computation time. To address these limitations, we propose Online Multi-modal Hashing with Dynamic Query-adaption (OMH-DQ). Specifically, a self-weighted fusion strategy is designed to adaptively preserve the multi-modal feature information in the hash codes by exploiting the complementarity of the modalities. The hash codes are learned under the supervision of pair-wise semantic labels to enhance their discriminative capability, while avoiding the challenging symmetric similarity matrix factorization. Under this learning framework, the binary hash codes can be obtained directly with efficient operations and without quantization errors. Accordingly, our method benefits from the semantic labels while avoiding high computational complexity. Moreover, to accurately capture query variations at the online retrieval stage, we design a parameter-free online hashing module that adaptively learns the query hash codes according to the dynamic query contents. Extensive experiments demonstrate the state-of-the-art performance of the proposed approach from various aspects.
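
The idea of query-adaptive modality weights can be illustrated with a small sketch. The code below is not the OMH-DQ formulation from the paper; it is a generic, parameter-free self-weighting scheme written under stated assumptions: each modality already has a learned linear projection into the code space, the weight of a modality is taken to be inversely proportional to the distance between its projection and the current binary code, and retrieval ranks database items by Hamming distance. The names `query_adaptive_code`, `hamming_distance`, and `rank_by_hamming` are hypothetical.

```python
import math


def project(matrix, vector):
    """Multiply an (r x d) projection matrix by a d-dimensional feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sign(vector):
    """Element-wise sign with values in {-1, +1}; zero maps to +1."""
    return [1 if x >= 0 else -1 for x in vector]


def query_adaptive_code(projections, features, iters=5, eps=1e-8):
    """Alternate between a binary code and per-query modality weights.

    Assumption (illustrative, not taken from the paper): the weight of a
    modality is proportional to 1 / (2 * residual), where the residual is
    the Euclidean distance between the code and that modality's projection.
    """
    projected = [project(W, x) for W, x in zip(projections, features)]
    num_modalities = len(projected)
    code_length = len(projected[0])
    weights = [1.0 / num_modalities] * num_modalities  # start from uniform
    code = None
    for _ in range(iters):
        # Fuse the projected modalities with the current weights, then binarize.
        fused = [
            sum(w * p[k] for w, p in zip(weights, projected))
            for k in range(code_length)
        ]
        code = sign(fused)
        # Modalities that agree with the code receive larger weights.
        residuals = [
            math.sqrt(sum((c - v) ** 2 for c, v in zip(code, p)))
            for p in projected
        ]
        raw = [1.0 / (2.0 * (r + eps)) for r in residuals]
        total = sum(raw)
        weights = [w / total for w in raw]
    return code, weights


def hamming_distance(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered by increasing Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
```

Because the weights are recomputed for every query, a modality that is weak or noisy for one query contributes little to that query's code, whereas a fixed-weight scheme would apply the same combination to all queries.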

Supplementary Material

MP4 File (cite4-14h30-d3.mp4)

Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2019
1512 pages
ISBN: 9781450361729
DOI: 10.1145/3331184
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic query-adaption
  2. efficient discrete optimization
  3. online multi-modal hashing
  4. self-weighted

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Key Research and Development Foundation of Shandong Province
  • Natural Science Foundation of Shandong, China

Conference

SIGIR '19
Acceptance Rates

SIGIR '19 paper acceptance rate: 84 of 426 submissions (20%)
Overall acceptance rate: 792 of 3,983 submissions (20%)

Article Metrics

  • Downloads (last 12 months): 69
  • Downloads (last 6 weeks): 5

Reflects downloads up to 11 Dec 2024.

Cited By

  • (2025) Deep multi-similarity hashing via label-guided network for cross-modal retrieval. Neurocomputing 616 (128830). DOI: 10.1016/j.neucom.2024.128830. Online publication date: Feb-2025
  • (2024) Fast Unsupervised Cross-Modal Hashing with Robust Factorization and Dual Projection. ACM Transactions on Multimedia Computing, Communications, and Applications 20:12 (1-21). DOI: 10.1145/3694684. Online publication date: 26-Nov-2024
  • (2024) Learning Domain Invariant Features for Unsupervised Indoor Depth Estimation Adaptation. ACM Transactions on Multimedia Computing, Communications, and Applications 20:9 (1-23). DOI: 10.1145/3672397. Online publication date: 13-Jun-2024
  • (2024) Online Cross-modal Hashing With Dynamic Prototype. ACM Transactions on Multimedia Computing, Communications, and Applications 20:8 (1-18). DOI: 10.1145/3665249. Online publication date: 13-Jun-2024
  • (2024) FedCAFE: Federated Cross-Modal Hashing with Adaptive Feature Enhancement. Proceedings of the 32nd ACM International Conference on Multimedia (9670-9679). DOI: 10.1145/3664647.3681319. Online publication date: 28-Oct-2024
  • (2024) Deep Neighborhood-aware Proxy Hashing with Uniform Distribution Constraint for Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 20:6 (1-23). DOI: 10.1145/3643639. Online publication date: 27-Jan-2024
  • (2024) Efficient Cross-Modal Video Retrieval With Meta-Optimized Frames. IEEE Transactions on Multimedia 26 (10924-10936). DOI: 10.1109/TMM.2024.3416669. Online publication date: 2024
  • (2024) Unsupervised Dual Hashing Coding (UDC) on Semantic Tagging and Sample Content for Cross-Modal Retrieval. IEEE Transactions on Multimedia 26 (9109-9120). DOI: 10.1109/TMM.2024.3385986. Online publication date: 2024
  • (2024) Multi-Facet Weighted Asymmetric Multi-Modal Hashing Based on Latent Semantic Distribution. IEEE Transactions on Multimedia 26 (7307-7320). DOI: 10.1109/TMM.2024.3363664. Online publication date: 2024
  • (2024) Similarity Transitivity Broken-Aware Multi-Modal Hashing. IEEE Transactions on Knowledge and Data Engineering 36:11 (7003-7014). DOI: 10.1109/TKDE.2024.3396492. Online publication date: Nov-2024
