research-article

Optimal operations for visual categorization

Authors:

Yi Xie

ICIMCS '10: Proceedings of the Second International Conference on Internet Multimedia Computing and Service

Pages 73 - 76

https://doi.org/10.1145/1937728.1937746

Published: 30 December 2010

Abstract

Bag-of-words is the state-of-the-art method for visual categorization. Its performance depends on four main operations: interest point detection, interest point description, classifier design, and codebook construction. In this paper, we focus on optimizing the first three operations. First, we compare several popular interest point detectors and propose a detector that combines the MSER detector with the Hessian-Laplace detector to sample key points. This detector combines interest regions with interest points, so that the image can be represented hierarchically. Second, we adopt SIFT to describe the sampled regions, because our experimental results show that SIFT is more robust than other popular descriptors. Third, we use an SVM with an RBF kernel for object classification; this classifier outperforms other classifiers in classification accuracy. To verify the three proposed operations, we evaluate them on two image datasets: Caltech and KTH-TIPS. The experimental results show that they increase the accuracy of object categorization.
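
The two generic steps that sit between description and classification in a bag-of-words pipeline can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy 2-D descriptors (standing in for 128-D SIFT vectors), the three-word codebook, and the value of `gamma` are all assumptions made for illustration, and the MSER and Hessian-Laplace detectors are not reproduced. The sketch quantizes local descriptors to their nearest codewords, builds a normalized histogram per image, and evaluates the RBF kernel that an SVM would use to compare two such histograms.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors: Sequence[Vector],
                  codebook: Sequence[Vector]) -> List[float]:
    """Assign each descriptor to its nearest codeword and return an
    L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors maps to the all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * squared_distance(x, y))


# Hypothetical toy data: a 3-word codebook and descriptors from two images.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
image_a = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (3.9, 4.2)]
image_b = [(0.0, 0.2), (0.2, 0.1), (4.1, 3.8), (4.0, 4.0)]

hist_a = bow_histogram(image_a, codebook)
hist_b = bow_histogram(image_b, codebook)
similarity = rbf_kernel(hist_a, hist_b, gamma=1.0)

print(hist_a, hist_b, similarity)
```

In a full system the codebook would typically be learned by clustering descriptors from training images, and the kernel values would be passed to an SVM trainer; both steps are omitted here.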

Cited By

Y. Qu, S. Wu, H. Liu, Y. Xie, and H. Wang (2018). Evaluation of local features and classifiers in BOW model for image classification. Multimedia Tools and Applications, 70(2):605-624. DOI: 10.1007/s11042-012-1107-z. Online publication date: 31-Dec-2018.
https://dl.acm.org/doi/10.1007/s11042-012-1107-z

Index Terms

Optimal operations for visual categorization
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information retrieval
    1. Information retrieval query processing

Published In

ICIMCS '10: Proceedings of the Second International Conference on Internet Multimedia Computing and Service

December 2010

218 pages

ISBN:9781450304603

DOI:10.1145/1937728

General Chairs: Yong Rui (Microsoft China, China), Klara Nahrstedt (University of Illinois at Urbana-Champaign), Xiaofei Xu (Harbin Institute of Technology, China)

Program Chairs: Hongxun Yao (Harbin Institute of Technology, China), Shuqiang Jiang (Institute of Computing Technology, Chinese Academy of Science, China), Jian Cheng (Institute of Automation, Chinese Academy of Science, China)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Ministry of Science and Technology of the People's Republic of China

Conference

ICIMCS '10: The Second International Conference on Internet Multimedia Computing and Service

December 30 - 31, 2010

Harbin, China

Acceptance Rates

Overall Acceptance Rate 163 of 456 submissions, 36%
