research-article

Semi-supervised Distance Consistent Cross-modal Retrieval

Authors:

Huaxiang Zhang

VSCC '17: Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities

Pages 25 - 31

https://doi.org/10.1145/3132734.3132735

Published: 23 October 2017

Abstract

Most existing cross-modal retrieval approaches exploit only labeled data to train the coupled projection matrices that support retrieval across heterogeneous modalities, and thus ignore the valuable information contained in unlabeled data. In this paper, we propose a novel Semi-Supervised Distance Consistent method (SSDC) to address this problem. Our approach first models the initial correlation between different modalities by constructing pseudo labels and the corresponding data for unlabeled queries. It then learns the projection matrices while adaptively optimizing the pseudo labels of the unlabeled data. In this way, SSDC learns discriminative projection matrices. Experimental results on two publicly available datasets demonstrate the superior performance of the proposed approach.
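
The abstract does not state the SSDC objective, so the following is only a minimal sketch of the general idea it describes: fit coupled projection matrices on labeled pairs, assign pseudo labels to unlabeled pairs, then refit and repeat. It assumes ridge regression onto one-hot labels as a stand-in loss, which is not claimed to be the authors' formulation. All function names, parameters, and the regularization weight `lam` are hypothetical.

```python
import numpy as np


def one_hot(labels, n_classes):
    """Encode integer class labels as rows of a one-hot matrix."""
    return np.eye(n_classes)[labels]


def ridge_projection(X, Y, lam=1.0):
    """Closed-form ridge regression: argmin_W ||XW - Y||^2 + lam * ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)


def fit_pseudo_label_projections(img_l, txt_l, y_l, img_u, txt_u,
                                 n_classes, n_iter=5, lam=1.0):
    """Alternate between fitting projections and refreshing pseudo labels.

    img_l, txt_l: labeled image/text features with paired rows.
    y_l: integer labels of the labeled pairs.
    img_u, txt_u: unlabeled image/text features with paired rows.
    (Assumed setup for illustration, not taken from the paper.)
    """
    Y_l = one_hot(y_l, n_classes)
    # Initial projections use the labeled pairs only.
    W_img = ridge_projection(img_l, Y_l, lam)
    W_txt = ridge_projection(txt_l, Y_l, lam)
    pseudo = None
    for _ in range(n_iter):
        # Pseudo label: class with the highest averaged projected score.
        scores = 0.5 * (img_u @ W_img + txt_u @ W_txt)
        new_pseudo = scores.argmax(axis=1)
        if pseudo is not None and np.array_equal(new_pseudo, pseudo):
            break  # pseudo labels stopped changing
        pseudo = new_pseudo
        # Refit both projections on labeled plus pseudo-labeled data.
        Y_all = np.vstack([Y_l, one_hot(pseudo, n_classes)])
        W_img = ridge_projection(np.vstack([img_l, img_u]), Y_all, lam)
        W_txt = ridge_projection(np.vstack([txt_l, txt_u]), Y_all, lam)
    return W_img, W_txt, pseudo


def retrieve(query, W_query, gallery, W_gallery):
    """Rank gallery items by Euclidean distance to the query in the common space."""
    q = query @ W_query
    g = gallery @ W_gallery
    return np.argsort(np.linalg.norm(g - q, axis=1))
```

For an image-to-text query, one would call `retrieve(image_feature, W_img, text_features, W_txt)`; swapping the roles of the two matrices gives text-to-image retrieval. Projecting both modalities into the label space is a design choice of this sketch that keeps the two projections directly comparable.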

Cited By

Luo X, Ju W, Gu Y, Qin Y, Yi S, Wu D, Liu L, Zhang M (2023). Toward Effective Semi-supervised Node Classification with Hybrid Curriculum Pseudo-labeling. ACM Transactions on Multimedia Computing, Communications, and Applications, 20:3 (1-19). DOI: 10.1145/3626528. Online publication date: 4-Oct-2023.
https://dl.acm.org/doi/10.1145/3626528
Xu G, Li X, Zhang Z (2020). Semantic Consistency Cross-Modal Retrieval With Semi-Supervised Graph Regularization. IEEE Access, 8 (14278-14288). DOI: 10.1109/ACCESS.2020.2966220. Online publication date: 2020.
https://doi.org/10.1109/ACCESS.2020.2966220

Index Terms

Semi-supervised Distance Consistent Cross-modal Retrieval
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval

Information & Contributors

Information

Published In

VSCC '17: Proceedings of the Workshop on Visual Analysis in Smart and Connected Communities

October 2017

58 pages

ISBN:9781450355063

DOI:10.1145/3132734

Program Chairs:
Xiaobai Liu, San Diego State University, USA
Yadong Mu, Peking University, China
Yu-Gang Jiang, Fudan University, China
Jiebo Luo, University of Rochester, USA

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Natural Science Foundation of Shandong China
National Natural Science Foundation of China
Key Research and Development Foundation of Shandong Province

Conference

MM '17

Sponsor:

SIGMM

MM '17: ACM Multimedia Conference

October 23, 2017

Mountain View, California, USA

Acceptance Rates

VSCC '17 Paper Acceptance Rate 6 of 12 submissions, 50%;

Overall Acceptance Rate 6 of 12 submissions, 50%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 2
Total Downloads: 145
Downloads (last 12 months): 5
Downloads (last 6 weeks): 0

Reflects downloads up to 22 Dec 2024
