DOI: 10.1145/2542050.2542077
research-article

Toward a practical visual object recognition system

Published: 05 December 2013

Abstract

Recent research in cognitive science and document recognition has been applied to the problem of object categorization. Bag-of-Features (BoF) and its extension, Spatial Pyramid Matching (SPM), were a breakthrough for this kind of problem. Many methods that follow this approach do improve recognition accuracy, but they still have drawbacks when used to build a real-world application whose data size is many times larger.
In this paper we propose two kinds of strategy, comprising five criteria, to evaluate and select the most appropriate training samples for building a high-performance classifier. We also suggest a method called reinforcement codebook learning, which makes the codebook training process both purpose-built to fit the most suitable criteria and much more efficient by significantly reducing its computational complexity. Experiments on benchmark object data demonstrate that our proposed framework achieves remarkable results, comparable with the state of the art, despite using just 20% of the 9 × 10^6 descriptors to train the dictionary. These results show promise for building an efficient and feasible object categorization system for practical applications, and suggest some ideas for improving visual feature representation in the future.
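
As a rough illustration of the pipeline the abstract refers to, the sketch below trains a Bag-of-Features codebook on a 20% subset of local descriptors (for the paper's 9 × 10^6 descriptors, that is about 1.8 × 10^6) and then encodes an image as a histogram of visual words. This is not the authors' method: the five selection criteria and the reinforcement codebook learning procedure are not described on this page, so uniform random sampling and plain k-means stand in for them here. All function names, the toy descriptor dimension, and the codebook size are assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))


def subsample(descriptors, fraction, rng):
    """Uniform random subset of the descriptors.

    Assumption: a placeholder for the paper's sample-selection criteria,
    which are not given in the abstract.
    """
    n_keep = max(1, round(len(descriptors) * fraction))
    return rng.sample(descriptors, n_keep)


def kmeans(points, k, iterations, rng):
    """Plain Lloyd's k-means; returns k cluster centers (the codebook)."""
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[j] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centers


def bof_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Toy data: 1000 random 8-dimensional "descriptors" instead of real features.
rng = random.Random(0)
all_descriptors = [[rng.random() for _ in range(8)] for _ in range(1000)]

training = subsample(all_descriptors, 0.2, rng)  # 20% of the descriptors
codebook = kmeans(training, k=16, iterations=10, rng=rng)
histogram = bof_histogram(all_descriptors[:50], codebook)  # one "image"
```

Each k-means iteration costs time proportional to the number of training descriptors, so training on a 20% subset cuts that cost by roughly a factor of five. Whether accuracy is preserved depends on how the subset is chosen, which is the question the paper's selection criteria address.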



                      Published In

                      SoICT '13: Proceedings of the 4th Symposium on Information and Communication Technology
                      December 2013
                      345 pages
                      ISBN:9781450324540
                      DOI:10.1145/2542050
                      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                      Sponsors

                      • SOICT: School of Information and Communication Technology - HUST
                      • NAFOSTED: The National Foundation for Science and Technology Development
                      • ACM Vietnam Chapter: ACM Vietnam Chapter
                      • Danang Univ. of Technol.: Danang University of Technology

                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States


                      Author Tags

                      1. K-mean vector quantization
                      2. feature representation
                      3. reinforcement learning
                      4. samples selection
                      5. spatial pyramid matching


                      Acceptance Rates

                      SoICT '13 Paper Acceptance Rate 40 of 80 submissions, 50%;
                      Overall Acceptance Rate 147 of 318 submissions, 46%
