research-article

Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval

Authors:

Alexander G. Hauptmann

ICMR '12: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval

Article No.: 16, Pages 1 - 8

https://doi.org/10.1145/2324796.2324816

Published: 05 June 2012

Abstract

Bag-of-words models are among the most widely used and successful representations in multimedia retrieval. However, the quantization error introduced when mapping keypoints to visual words is one of the main drawbacks of the bag-of-words model. Although some techniques, such as soft-assignment to bags [23] and query expansion [27], have been introduced to deal with the problem, the performance gain always comes at the cost of longer query response time, which makes them difficult to apply to large-scale multimedia retrieval applications. In this paper, we propose a simple "constrained keypoint quantization" method which can effectively reduce the overall quantization error of the bag-of-words representation and greatly improve retrieval efficiency at the same time. The central idea of the proposed quantization method is that if a keypoint is far away from all visual words, we simply remove it. At first glance, this simple strategy seems naive and dangerous. However, we show that the proposed method has a solid theoretical background. Our experimental results on three widely used datasets for near-duplicate image and video retrieval confirm that by removing a large number of keypoints which have high quantization error, we obtain comparable or even better retrieval performance while dramatically boosting retrieval efficiency.
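The central idea stated in the abstract (drop a keypoint when it is far from every visual word) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `constrained_quantize`, the fixed Euclidean cutoff `max_dist`, and the brute-force nearest-word search are all assumptions made for illustration, and the abstract does not say how "far away" is actually decided.

```python
import math
from collections import Counter


def constrained_quantize(descriptors, codebook, max_dist):
    """Build a bag-of-words histogram, skipping keypoints far from all words.

    descriptors: iterable of equal-length numeric tuples (keypoint descriptors).
    codebook:    list of equal-length numeric tuples (visual word centers).
    max_dist:    hypothetical distance cutoff; how the paper chooses the
                 constraint is not stated in the abstract.
    Returns (histogram, kept_count).
    """
    histogram = Counter()
    kept = 0
    for d in descriptors:
        # Nearest visual word by Euclidean distance (brute force for clarity;
        # a large-scale system would use an approximate nearest-neighbor index).
        best_word, best_dist = min(
            ((w, math.dist(d, c)) for w, c in enumerate(codebook)),
            key=lambda pair: pair[1],
        )
        # Constrained step: a keypoint far from every word is simply removed.
        if best_dist <= max_dist:
            histogram[best_word] += 1
            kept += 1
    return histogram, kept


# Toy 2-D example: the third descriptor lies far from both words and is dropped.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(0.5, 0.0), (9.0, 10.0), (5.0, 5.0), (0.0, 1.0)]
hist, kept = constrained_quantize(descriptors, codebook, max_dist=2.0)
```

In this toy setup, fewer keypoints enter the histogram, which is the mechanism the abstract credits for both lower overall quantization error and faster retrieval.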

References

[1] cc_web_video: Near-duplicate web video dataset. Available: http://vireo.cs.cityu.edu.hk/webvideo/.
[2] http://www.flickr.com.
[3] http://www.robots.ox.ac.uk/~vgg/data/oxbuildings.
[4] http://www.robots.ox.ac.uk/~vgg/research/affine.
[5] R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. ACM Press, 1999.
[6] O. Boiman, E. Shechtman, and M. Irani. In defense of nearest-neighbor based image classification. In Computer Vision and Pattern Recognition, 2008.
[7] S. Boughhorbel, J.-P. Tarel, and F. Fleuret. Non-Mercer kernels for SVM object recognition. In British Machine Vision Conference, 2004.
[8] Y. Cai, L. Yang, W. Ping, F. Wang, T. Mei, X.-S. Hua, and S. Li. Million-scale near-duplicate video retrieval system. In ACM Multimedia, 2011.
[9] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, 2004.
[10] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley-Interscience Publication, 2000.
[11] K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of features. Journal of Machine Learning Research, 2007.
[12] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 1963.
[13] H. Jégou, M. Douze, and C. Schmid. Improving bag-of-features for large scale image search. International Journal of Computer Vision, 2010.
[14] F. Jurie and B. Triggs. Creating efficient codebooks for visual recognition. In Computer Vision and Pattern Recognition, 2005.
[15] Y. Ke, R. Sukthankar, and L. Huston. Efficient near-duplicate detection and sub-image retrieval. In ACM Multimedia, 2004.
[16] D. Li, L. Yang, X.-S. Hua, and H.-J. Zhang. Large-scale robust visual codebook construction. In ACM Multimedia, 2010.
[17] F. Li, W. Tong, R. Jin, A. K. Jain, and J.-E. Lee. An efficient key point quantization algorithm for large scale image retrieval. In ACM Workshop on Large-Scale Multimedia Retrieval and Mining, 2009.
[18] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004.
[19] S. Lyu. Mercer kernels for object recognition with local features. In Computer Vision and Pattern Recognition, 2005.
[20] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009.
[21] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In Computer Vision and Pattern Recognition, 2006.
[22] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In Computer Vision and Pattern Recognition, 2007.
[23] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In Computer Vision and Pattern Recognition, 2008.
[24] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In International Conference on Computer Vision, 2003.
[25] T. Tuytelaars and C. Schmid. Vector quantizing feature space with a regular lattice. In International Conference on Computer Vision, 2007.
[26] X. Wu, A. G. Hauptmann, and C.-W. Ngo. Practical elimination of near-duplicates from web video search. In ACM Multimedia, 2007.
[27] L. Yang, Y. Cai, A. Hanjalic, X.-S. Hua, and S. Li. Video-based image retrieval. In ACM Multimedia, 2011.
[28] L. Yang, B. Geng, Y. Cai, A. Hanjalic, and X.-S. Hua. Object retrieval using visual query context. IEEE Transactions on Multimedia, 2011.
[29] Y. Yang, F. Nie, D. Xu, J. Luo, Y. Zhuang, and Y. Pan. A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012.
[30] Y. Yang, Y.-T. Zhuang, F. Wu, and Y.-H. Pan. Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval. IEEE Transactions on Multimedia, 2008.
[31] W.-L. Zhao, S. Tan, and C.-W. Ngo. Large-scale near-duplicate web video search: challenge and opportunity. In International Conference on Multimedia and Expo, 2009.

Cited By

Yang Y, Tian Y, Huang T (2019). Multiscale video sequence matching for near-duplicate detection and retrieval. Multimedia Tools and Applications 78:1 (311-336). DOI: 10.1007/s11042-018-5862-3. Online publication date: 1-Jan-2019.
https://dl.acm.org/doi/10.1007/s11042-018-5862-3
Zheng L, Yang Y, Tian Q (2018). SIFT Meets CNN: A Decade Survey of Instance Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 40:5 (1224-1244). DOI: 10.1109/TPAMI.2017.2709749. Online publication date: 1-May-2018.
https://doi.org/10.1109/TPAMI.2017.2709749
Zhen M, Wang W, Wang R, Kender J, Smith J, Luo J, Boll S, Hsu W (2016). Regional Subspace Projection Coding for Image Retrieval. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (205-212). DOI: 10.1145/2911996.2912003. Online publication date: 6-Jun-2016.
https://dl.acm.org/doi/10.1145/2911996.2912003

Index Terms

Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations

Information & Contributors

Information

Published In

ICMR '12: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval

June 2012

489 pages

ISBN:9781450313292

DOI:10.1145/2324796

Conference Chairs:
Horace H. S. Ip (City University of Hong Kong), Yong Rui (Microsoft, China)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Division of Information and Intelligent Systems

Conference

ICMR '12

Sponsor:

SIGMM

ICMR '12: International Conference on Multimedia Retrieval

June 5 - 8, 2012

Hong Kong, China

Acceptance Rates

ICMR '12 Paper Acceptance Rate 50 of 145 submissions, 34%;

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 9
Total Downloads: 414
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Tian Y, Qian M, Huang T (2015). TASC. ACM Transactions on Information Systems 33:2 (1-34). DOI: 10.1145/2699662. Online publication date: 17-Feb-2015.
https://dl.acm.org/doi/10.1145/2699662
Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian (2015). Fast Image Retrieval: Query Pruning and Early Termination. IEEE Transactions on Multimedia 17:5 (648-659). DOI: 10.1109/TMM.2015.2408563. Online publication date: 1-May-2015.
https://dl.acm.org/doi/10.1109/TMM.2015.2408563
Zheng L, Wang S, Guo P, Liang H, Tian Q (2015). Tensor index for large scale image retrieval. Multimedia Systems 21:6 (569-579). DOI: 10.1007/s00530-014-0415-8. Online publication date: 1-Nov-2015.
https://dl.acm.org/doi/10.1007/s00530-014-0415-8
Liang Zheng, Shengjin Wang, Qi Tian (2014). \(\mathcal{L}_p\)-Norm IDF for Scalable Image Retrieval. IEEE Transactions on Image Processing 23:8 (3604-3617). DOI: 10.1109/TIP.2014.2329182. Online publication date: Aug-2014.
https://doi.org/10.1109/TIP.2014.2329182
Lopes B, Goularte R, Prazeres C, Sampaio P, Santanchè A, Santos C, Goularte R (2013). Multimodal late fusion bag of features applied to scene detection. Proceedings of the 19th Brazilian symposium on Multimedia and the web (15-22). DOI: 10.1145/2526188.2526202. Online publication date: 5-Nov-2013.
https://dl.acm.org/doi/10.1145/2526188.2526202
Zheng L, Wang S, Liu Z, Tian Q (2013). Lp-Norm IDF for Large Scale Image Search. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition (1626-1633). DOI: 10.1109/CVPR.2013.213. Online publication date: 23-Jun-2013.
https://dl.acm.org/doi/10.1109/CVPR.2013.213
