DOI: 10.1145/1937728.1937746
research-article

Optimal operations for visual categorization

Published: 30 December 2010

Abstract

Bag-of-words is the state-of-the-art method for visual categorization. Its performance depends on four main operations: interest point detection, interest point description, classifier design, and codebook construction. In this paper, we focus on optimizing the first three operations. First, we compare several popular interest point detectors and propose a detector that combines the MSER detector with the Hessian-Laplace detector to sample key points. This detector combines interest regions with interest points, so that the image can be represented hierarchically. Second, we adopt SIFT to describe the sampled regions, because our experimental results show that SIFT is more robust than other popular descriptors. Third, we use an SVM with an RBF kernel for object classification. The proposed classifier outperforms other classifiers in classification accuracy. To verify the three proposed operations, we evaluate them on two image datasets: Caltech and KTH-TIPS. The experimental results show that the proposed operations increase the accuracy of object categorization.
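The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the outputs of the two detectors are merged, so plain concatenation is assumed here, and the toy 2-D descriptors, the two-word codebook, and the function names (`bow_histogram`, `rbf_kernel`) are hypothetical stand-ins for real SIFT descriptors and a learned codebook.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bow_histogram(descriptors: Sequence[Vector],
                  codebook: Sequence[Vector]) -> List[float]:
    """Normalized bag-of-words histogram: fraction of descriptors per codeword."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def rbf_kernel(h1: Vector, h2: Vector, gamma: float = 1.0) -> float:
    """RBF kernel k(h1, h2) = exp(-gamma * ||h1 - h2||^2), as used by an RBF SVM."""
    return math.exp(-gamma * squared_distance(h1, h2))


# Toy data (assumption): descriptors computed at key points from each detector.
codebook = [(0.0, 0.0), (1.0, 1.0)]
mser_descriptors = [(0.1, 0.0), (0.9, 1.1)]
hessian_laplace_descriptors = [(0.0, 0.2), (0.1, 0.1)]

# Assumed merge strategy: pool the descriptors from both detectors.
image_descriptors = mser_descriptors + hessian_laplace_descriptors
hist = bow_histogram(image_descriptors, codebook)
```

In a full system the histograms of all training images would be passed to an SVM trained with this kernel; `gamma` is a free parameter that would normally be chosen by cross-validation.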


Cited By

  • (2018) Evaluation of local features and classifiers in BOW model for image classification. Multimedia Tools and Applications 70:2 (605-624). DOI: 10.1007/s11042-012-1107-z. Online publication date: 31-Dec-2018

Published In

ICIMCS '10: Proceedings of the Second International Conference on Internet Multimedia Computing and Service
December 2010
218 pages
ISBN:9781450304603
DOI:10.1145/1937728
General Chairs: Yong Rui, Klara Nahrstedt, Xiaofei Xu
Program Chairs: Hongxun Yao, Shuqiang Jiang, Jian Cheng

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bag of words
  2. interest point detector
  3. local feature
  4. visual categorization

Qualifiers

  • Research-article

Conference

ICIMCS '10

Acceptance Rates

Overall Acceptance Rate 163 of 456 submissions, 36%
