DOI: 10.1145/3132734.3132735
research-article

Semi-supervised Distance Consistent Cross-modal Retrieval

Published: 23 October 2017

Abstract

Most existing cross-modal retrieval approaches exploit only labeled data to train the coupled projection matrices that support retrieval across heterogeneous modalities, and so ignore the valuable information contained in unlabeled data. In this paper, we propose a novel Semi-Supervised Distance Consistent method (SSDC) to address this problem. Our approach first models the initial correlation between modalities by constructing pseudo labels and the corresponding data for unlabeled queries. It then learns the projection matrices while adaptively optimizing the pseudo labels of the unlabeled data. In this way, SSDC learns discriminative projection matrices. Experimental results on two publicly available datasets demonstrate the superior performance of the proposed approach.
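
The abstract does not give the SSDC objective, so the following is only a minimal sketch of the general idea it describes: learn coupled projection matrices from labeled pairs, assign pseudo labels to unlabeled data, and alternate between the two. Everything specific here is an assumption made for illustration and not the authors' formulation: the ridge-regression objective, the one-hot label space used as the common space, the assumption that unlabeled samples come as image-text pairs, and the hypothetical names `ridge_projection`, `fit_with_pseudo_labels`, and `retrieve`.

```python
import numpy as np


def ridge_projection(X, Y, lam=0.1):
    """Closed-form ridge regression: W = (X^T X + lam * I)^(-1) X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)


def one_hot(labels, n_classes):
    """Turn integer class labels into one-hot rows."""
    return np.eye(n_classes)[labels]


def fit_with_pseudo_labels(X_img_l, X_txt_l, y_l, X_img_u, X_txt_u,
                           n_classes, n_iters=5, lam=0.1):
    """Alternate between pseudo-labeling and refitting the projections.

    Assumed setup (not taken from the paper): labeled and unlabeled data
    are image-text pairs, and the common space is the one-hot label space.
    """
    Y_l = one_hot(y_l, n_classes)
    # Initial projections learned from the labeled pairs only.
    W_img = ridge_projection(X_img_l, Y_l, lam)
    W_txt = ridge_projection(X_txt_l, Y_l, lam)

    X_img_all = np.vstack([X_img_l, X_img_u])
    X_txt_all = np.vstack([X_txt_l, X_txt_u])
    for _ in range(n_iters):
        # Pseudo labels: the class with the highest combined score.
        scores = X_img_u @ W_img + X_txt_u @ W_txt
        pseudo = scores.argmax(axis=1)
        # Refit both projections on labeled plus pseudo-labeled data.
        Y_all = np.vstack([Y_l, one_hot(pseudo, n_classes)])
        W_img = ridge_projection(X_img_all, Y_all, lam)
        W_txt = ridge_projection(X_txt_all, Y_all, lam)
    return W_img, W_txt, pseudo


def retrieve(query_img, X_txt_db, W_img, W_txt, k=5):
    """Rank database texts by Euclidean distance to a projected image query."""
    q = query_img @ W_img
    db = X_txt_db @ W_txt
    dists = np.linalg.norm(db - q, axis=1)
    return np.argsort(dists)[:k]
```

With the learned `W_img` and `W_txt`, an image query and a text database are projected into the same space, so `retrieve` can compare them directly. A text-to-image search would swap the roles of the two matrices.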

Cited By

  • (2023) Toward Effective Semi-supervised Node Classification with Hybrid Curriculum Pseudo-labeling. ACM Transactions on Multimedia Computing, Communications, and Applications 20:3 (1-19). DOI: 10.1145/3626528. Online publication date: 4-Oct-2023
  • (2020) Semantic Consistency Cross-Modal Retrieval With Semi-Supervised Graph Regularization. IEEE Access 8 (14278-14288). DOI: 10.1109/ACCESS.2020.2966220. Online publication date: 2020

Information & Contributors

Information

Published In

VSCC '17: Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities
October 2017
58 pages
ISBN: 9781450355063
DOI: 10.1145/3132734
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cross-modal retrieval
  2. pseudo label
  3. semi-supervised

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of Shandong China
  • National Natural Science Foundation of China
  • Key Research and Development Foundation of Shandong Province

Conference

MM '17
MM '17: ACM Multimedia Conference
October 23, 2017
Mountain View, California, USA

Acceptance Rates

VSCC '17 Paper Acceptance Rate 6 of 12 submissions, 50%;
Overall Acceptance Rate 6 of 12 submissions, 50%
