DOI: 10.1145/3123266.3123392
research-article

Efficient Binary Coding for Subspace-based Query-by-Image Video Retrieval

Published: 19 October 2017

Abstract

Subspace representations are widely used for videos in many tasks. Subspace-based query-by-image video retrieval (QBIVR) in particular remains challenging, both in designing similarity-preserving measurements and in building efficient retrieval schemes. In this paper, we propose a novel subspace-based QBIVR framework to enable efficient video search. We first define a new geometry-preserving distance metric to measure the image-to-video distance, which transforms the QBIVR task into a Maximum Inner Product Search (MIPS) problem. The merit of this metric is that it helps preserve the genuine geometric relationship between query images and database videos to the greatest extent. To solve the MIPS problem efficiently, we introduce two asymmetric hashing schemes that properly bridge the domain gap between images and videos. The first, Inner-product Binary Coding (IBC), achieves high-quality binary codes by learning the binary codes and coding functions simultaneously, without continuous relaxations. The second, Bilinear Binary Coding (BBC), employs compact bilinear projections instead of a single large projection matrix to further improve retrieval efficiency. Extensive experiments on four real-world video datasets verify the effectiveness of the proposed approaches compared with state-of-the-art methods.
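
The two ideas in the abstract, turning an image-to-subspace distance into an inner product and coding with bilinear projections, can be illustrated with a small sketch. This is not the paper's metric or its learned hash functions, which the abstract does not specify. It assumes the plain Euclidean point-to-subspace distance, builds each video's subspace by Gram-Schmidt on toy frame vectors, and uses random (not learned) projections R1 and R2. All function names are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(frames):
    """Gram-Schmidt: orthonormal vectors spanning the frame features."""
    basis = []
    for frame in frames:
        v = [float(x) for x in frame]
        for u in basis:
            c = dot(v, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        norm = dot(v, v) ** 0.5
        if norm > 1e-9:  # skip frames already inside the current span
            basis.append([vi / norm for vi in v])
    return basis

def sq_distance_to_subspace(q, basis):
    """||q||^2 - ||U^T q||^2 for an orthonormal basis U (assumed metric)."""
    return dot(q, q) - sum(dot(q, u) ** 2 for u in basis)

def outer(q):
    """Query-side embedding q q^T."""
    return [[qi * qj for qj in q] for qi in q]

def projector(basis):
    """Database-side embedding P = U U^T (d x d projection matrix)."""
    d = len(basis[0])
    return [[sum(u[i] * u[j] for u in basis) for j in range(d)] for i in range(d)]

def frobenius_inner(a, b):
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def bilinear_code(m, r1, r2):
    """Bits of sign(R1^T M R2), flattened row by row."""
    d = len(m)
    k1, k2 = len(r1[0]), len(r2[0])
    bits = []
    for a in range(k1):
        for b in range(k2):
            s = sum(r1[i][a] * m[i][j] * r2[j][b]
                    for i in range(d) for j in range(d))
            bits.append(1 if s >= 0 else -1)
    return bits

# Two toy "videos", each given as frame feature vectors in R^3.
videos = {
    "A": orthonormal_basis([[1, 0, 0], [1, 1, 0]]),
    "B": orthonormal_basis([[0, 0, 2]]),
}
query = [1.0, 2.0, 0.5]

# Since ||q||^2 is fixed per query, the nearest subspace is the one that
# maximizes <q q^T, U U^T>, which is an inner product search.
nearest = min(videos, key=lambda k: sq_distance_to_subspace(query, videos[k]))
best_ip = max(videos, key=lambda k: frobenius_inner(outer(query), projector(videos[k])))
print(nearest, best_ip)  # prints "A A"

# Random bilinear projections: d*(k1+k2) parameters instead of d*d*k1*k2.
rng = random.Random(0)
d, k1, k2 = 3, 2, 2
r1 = [[rng.gauss(0, 1) for _ in range(k1)] for _ in range(d)]
r2 = [[rng.gauss(0, 1) for _ in range(k2)] for _ in range(d)]
code = bilinear_code(outer(query), r1, r2)
print(len(code), "bits;", d * (k1 + k2), "bilinear parameters vs",
      d * d * k1 * k2, "for one full projection")
```

The query and the videos are embedded by different maps (q q^T versus U U^T), which is the sense in which such a scheme is asymmetric. The parameter count at the end shows why two small projection matrices are attractive when the embedded matrices are d x d.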

    Published In

    MM '17: Proceedings of the 25th ACM international conference on Multimedia
    October 2017
    2028 pages
    ISBN: 9781450349062
    DOI: 10.1145/3123266

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. asymmetric hashing
    2. geometry-preserving distance metric
    3. query-by-image
    4. video retrieval

    Qualifiers

    • Research-article

    Conference

    MM '17: ACM Multimedia Conference
    October 23 - 27, 2017
    Mountain View, California, USA

    Acceptance Rates

    MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (Last 12 months): 8
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 13 Jan 2025

    Cited By

    • (2024) Bridging asymmetry between image and video: Cross-modality knowledge transfer based on learning from video. Expert Systems with Applications (125873). DOI: 10.1016/j.eswa.2024.125873. Online publication date: Nov-2024
    • (2022) Activity Image-to-Video Retrieval via Domain Adversarial Learning. 2022 34th Chinese Control and Decision Conference (CCDC) (6183-6188). DOI: 10.1109/CCDC55256.2022.10034352. Online publication date: 15-Aug-2022
    • (2022) Transformed Deep Spatio Temporal-Features with Fused Distance for Efficient Video Retrieval. 2022 4th International Conference on Artificial Intelligence and Speech Technology (AIST) (1-5). DOI: 10.1109/AIST55798.2022.10064821. Online publication date: 9-Dec-2022
    • (2022) Asymmetric hashing based on generative adversarial network. Multimedia Tools and Applications 82:1 (389-405). DOI: 10.1007/s11042-022-13141-2. Online publication date: 7-Jun-2022
    • (2021) Collaborative Learning for Extremely Low Bit Asymmetric Hashing. IEEE Transactions on Knowledge and Data Engineering 33:12 (3675-3685). DOI: 10.1109/TKDE.2020.2977633. Online publication date: 1-Dec-2021
    • (2020) Deep Heterogeneous Hashing for Face Video Retrieval. IEEE Transactions on Image Processing 29 (1299-1312). DOI: 10.1109/TIP.2019.2940683. Online publication date: 2020
    • (2018) Binary Coding by Matrix Classifier for Efficient Subspace Retrieval. Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval (82-90). DOI: 10.1145/3206025.3206058. Online publication date: 5-Jun-2018
