[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2733373.2806340acmconferencesArticle/Chapter ViewAbstractPublication PagesmmConference Proceedingsconference-collections
short-paper

SHOE: Sibling Hashing with Output Embeddings

Published: 13 October 2015 Publication History

Abstract

We present a supervised binary encoding scheme for image retrieval that learns projections by taking into account similarity between classes obtained from output embeddings. Our motivation is that binary hash codes learned in this way improve the visual quality of retrieval results by ranking related (or ``sibling'') class images before unrelated class images. We employ a sequential greedy optimization that learns relationship aware projections by minimizing the difference between inner products of binary codes and output embedding vectors. We develop a joint optimization framework to learn projections which improve the accuracy of supervised hashing over the current state of the art with respect to standard and sibling evaluation metrics. We further obtain discriminative features learned from correlations of kernelized input CNN features and output embeddings, which significantly boosts performance. Experiments are performed on three datasets: CUB-2011, SUN-Attribute and ImageNet ILSVRC 2010, where we show significant improvement in sibling performance metrics over state-of-the-art supervised hashing techniques, while maintaining performance with respect to standard metrics.

References

[1]
M. S. Charikar. Similarity estimation techniques from rounding algorithms. STOC, 2002.
[2]
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR, 2009.
[3]
Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, 2011.
[4]
D. Hsu, S. Kakade, J. Langford, and T. Zhang. Multi-label prediction via compressed sensing. In NIPS, 2009.
[5]
A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
[6]
C. H. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In CVPR, 2009.
[7]
G. Lin, C. Shen, Q. Shi, A. van den Hengel, and D. Suter. Fast supervised hashing with decision trees for high-dimensional data. In CVPR, 2014.
[8]
W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang. Supervised hashing with kernels. In CVPR, 2012.
[9]
G. Patterson, C. Xu, H. Su, and J. Hays. The sun attribute database: Beyond categories for deeper scene understanding. ICCV, 2014.
[10]
I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, Dec. 2005.
[11]
K. Q. Weinberger and O. Chapelle. Large margin taxonomy embedding for document categorization. In NIPS. 2009.
[12]
P. Welinder, S. Branson, T. Mita, C. Wah, F. Schroff, S. Belongie, and P. Perona. Caltech-UCSD Birds 200. Technical Report CNS-TR-2010-001, 2010.

Cited By

View all
  • (2018)BlackthornIEEE Transactions on Multimedia10.1109/TMM.2017.275598620:3(687-698)Online publication date: 1-Mar-2018
  • (2016)Interactive Multimodal Learning on 100 Million ImagesProceedings of the 2016 ACM on International Conference on Multimedia Retrieval10.1145/2911996.2912062(333-337)Online publication date: 6-Jun-2016

Index Terms

  1. SHOE: Sibling Hashing with Output Embeddings

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    MM '15: Proceedings of the 23rd ACM international conference on Multimedia
    October 2015
    1402 pages
    ISBN:9781450334594
    DOI:10.1145/2733373
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 October 2015

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. output embeddings
    2. sibling hashing
    3. supervised hashing

    Qualifiers

    • Short-paper

    Funding Sources

    • EAGER: Scalable Video Retrieval

    Conference

    MM '15
    Sponsor:
    MM '15: ACM Multimedia Conference
    October 26 - 30, 2015
    Brisbane, Australia

    Acceptance Rates

    MM '15 Paper Acceptance Rate 56 of 252 submissions, 22%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)3
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 14 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2018)BlackthornIEEE Transactions on Multimedia10.1109/TMM.2017.275598620:3(687-698)Online publication date: 1-Mar-2018
    • (2016)Interactive Multimodal Learning on 100 Million ImagesProceedings of the 2016 ACM on International Conference on Multimedia Retrieval10.1145/2911996.2912062(333-337)Online publication date: 6-Jun-2016

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media