DOI: 10.1145/2324796.2324816

Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval

Published: 05 June 2012

Abstract

Bag-of-words models are among the most widely used and successful representations in multimedia retrieval. However, the quantization error introduced when mapping keypoints to visual words is one of the main drawbacks of the bag-of-words model. Although some techniques, such as soft-assignment to bags [23] and query expansion [27], have been introduced to deal with the problem, their performance gain always comes at the cost of longer query response time, which makes them difficult to apply to large-scale multimedia retrieval applications. In this paper, we propose a simple "constrained keypoint quantization" method that effectively reduces the overall quantization error of the bag-of-words representation and greatly improves retrieval efficiency at the same time. The central idea of the proposed quantization method is that if a keypoint is far away from all visual words, we simply remove it. At first glance, this simple strategy seems naive and dangerous. However, we show that the proposed method has a solid theoretical background. Our experimental results on three widely used datasets for near-duplicate image and video retrieval confirm that by removing a large number of keypoints with high quantization error, we obtain comparable or even better retrieval performance while dramatically boosting retrieval efficiency.
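
The central idea can be illustrated with a minimal sketch. The abstract does not say how "far away" is measured or how the cutoff is chosen, so the Euclidean distance, the `max_distance` parameter, and the function names below are assumptions made for illustration only, not the authors' implementation.

```python
import math
from collections import Counter


def nearest_word(keypoint, codebook):
    """Return (index, distance) of the closest visual word (Euclidean, assumed)."""
    best_index, best_dist = -1, math.inf
    for index, word in enumerate(codebook):
        dist = math.dist(keypoint, word)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


def constrained_quantize(keypoints, codebook, max_distance):
    """Build a bag-of-words histogram, dropping far-away keypoints.

    max_distance is a hypothetical threshold: a keypoint whose nearest
    visual word is farther than this is removed instead of being assigned.
    """
    histogram = Counter()
    dropped = 0
    for keypoint in keypoints:
        index, dist = nearest_word(keypoint, codebook)
        if dist > max_distance:
            dropped += 1  # high quantization error: discard the keypoint
            continue
        histogram[index] += 1
    return histogram, dropped


# Toy 2-D data; real descriptors such as SIFT have far more dimensions.
codebook = [(0.0, 0.0), (10.0, 10.0)]
keypoints = [(0.5, 0.0), (9.0, 10.0), (5.0, 5.0), (0.0, 1.0)]
histogram, dropped = constrained_quantize(keypoints, codebook, max_distance=2.0)
print(dict(histogram), dropped)
```

In this toy example the keypoint `(5.0, 5.0)` lies far from both visual words and is discarded, so it contributes nothing to the histogram. Fewer retained keypoints also means fewer entries to index and match per image, which is consistent with the efficiency gain the abstract describes.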

References

[1] cc_web_video: Near-duplicate web video dataset. Available: http://vireo.cs.cityu.edu.hk/webvideo/.
[2] http://www.flickr.com.
[3] http://www.robots.ox.ac.uk/~vgg/data/oxbuildings.
[4] http://www.robots.ox.ac.uk/~vgg/research/affine.
[5] R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. ACM Press, 1999.
[6] O. Boiman, E. Shechtman, and M. Irani. In defense of nearest-neighbor based image classification. In Computer Vision and Pattern Recognition, 2008.
[7] S. Boughorbel, J.-P. Tarel, and F. Fleuret. Non-Mercer kernels for SVM object recognition. In British Machine Vision Conference, 2004.
[8] Y. Cai, L. Yang, W. Ping, F. Wang, T. Mei, X.-S. Hua, and S. Li. Million-scale near-duplicate video retrieval system. In ACM Multimedia, 2011.
[9] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, 2004.
[10] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience Publication, 2000.
[11] K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of features. Journal of Machine Learning Research, 2007.
[12] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 1963.
[13] H. Jégou, M. Douze, and C. Schmid. Improving bag-of-features for large scale image search. International Journal of Computer Vision, 2010.
[14] F. Jurie and B. Triggs. Creating efficient codebooks for visual recognition. In Computer Vision and Pattern Recognition, 2005.
[15] Y. Ke, R. Sukthankar, and L. Huston. Efficient near-duplicate detection and sub-image retrieval. In ACM Multimedia, 2004.
[16] D. Li, L. Yang, X.-S. Hua, and H.-J. Zhang. Large-scale robust visual codebook construction. In ACM Multimedia, 2010.
[17] F. Li, W. Tong, R. Jin, A. K. Jain, and J.-E. Lee. An efficient key point quantization algorithm for large scale image retrieval. In ACM Workshop on Large-Scale Multimedia Retrieval and Mining, 2009.
[18] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004.
[19] S. Lyu. Mercer kernels for object recognition with local features. In Computer Vision and Pattern Recognition, 2005.
[20] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009.
[21] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In Computer Vision and Pattern Recognition, 2006.
[22] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Computer Vision and Pattern Recognition, 2007.
[23] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In Computer Vision and Pattern Recognition, 2008.
[24] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In International Conference on Computer Vision, 2003.
[25] T. Tuytelaars and C. Schmid. Vector quantizing feature space with a regular lattice. In International Conference on Computer Vision, 2007.
[26] X. Wu, A. G. Hauptmann, and C.-W. Ngo. Practical elimination of near-duplicates from web video search. In ACM Multimedia, 2007.
[27] L. Yang, Y. Cai, A. Hanjalic, X.-S. Hua, and S. Li. Video-based image retrieval. In ACM Multimedia, 2011.
[28] L. Yang, B. Geng, Y. Cai, A. Hanjalic, and X.-S. Hua. Object retrieval using visual query context. IEEE Transactions on Multimedia, 2011.
[29] Y. Yang, F. Nie, D. Xu, J. Luo, Y. Zhuang, and Y. Pan. A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.
[30] Y. Yang, Y.-T. Zhuang, F. Wu, and Y.-H. Pan. Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval. IEEE Transactions on Multimedia, 2008.
[31] W.-L. Zhao, S. Tan, and C.-W. Ngo. Large-scale near-duplicate web video search: challenge and opportunity. In International Conference on Multimedia and Expo, 2009.

    Published In

    ICMR '12: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
    June 2012
    489 pages
    ISBN: 9781450313292
    DOI: 10.1145/2324796

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bag-of-words model
    2. multimedia retrieval
    3. visual word quantization

    Qualifiers

    • Research-article

    Conference

    ICMR '12

    Acceptance Rates

    ICMR '12 Paper Acceptance Rate 50 of 145 submissions, 34%;
    Overall Acceptance Rate 254 of 830 submissions, 31%

    Cited By

    • (2019) Multiscale video sequence matching for near-duplicate detection and retrieval. Multimedia Tools and Applications 78:1 (311-336). DOI: 10.1007/s11042-018-5862-3. Online publication date: 1-Jan-2019
    • (2018) SIFT Meets CNN: A Decade Survey of Instance Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 40:5 (1224-1244). DOI: 10.1109/TPAMI.2017.2709749. Online publication date: 1-May-2018
    • (2016) Regional Subspace Projection Coding for Image Retrieval. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (205-212). DOI: 10.1145/2911996.2912003. Online publication date: 6-Jun-2016
    • (2015) TASC. ACM Transactions on Information Systems 33:2 (1-34). DOI: 10.1145/2699662. Online publication date: 17-Feb-2015
    • (2015) Fast Image Retrieval: Query Pruning and Early Termination. IEEE Transactions on Multimedia 17:5 (648-659). DOI: 10.1109/TMM.2015.2408563. Online publication date: 1-May-2015
    • (2015) Tensor index for large scale image retrieval. Multimedia Systems 21:6 (569-579). DOI: 10.1007/s00530-014-0415-8. Online publication date: 1-Nov-2015
    • (2014) \(\mathcal{L}_p\)-Norm IDF for Scalable Image Retrieval. IEEE Transactions on Image Processing 23:8 (3604-3617). DOI: 10.1109/TIP.2014.2329182. Online publication date: Aug-2014
    • (2013) Multimodal late fusion bag of features applied to scene detection. Proceedings of the 19th Brazilian symposium on Multimedia and the web (15-22). DOI: 10.1145/2526188.2526202. Online publication date: 5-Nov-2013
    • (2013) Lp-Norm IDF for Large Scale Image Search. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition (1626-1633). DOI: 10.1109/CVPR.2013.213. Online publication date: 23-Jun-2013
