
Discriminative compact pyramids for object and scene recognition

Published: 01 April 2012

Abstract

Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. A major drawback, however, is that they lead to high-dimensional image representations. In this paper, we present a novel framework for obtaining a compact pyramid representation. First, we investigate the use of the divisive information-theoretic feature clustering (DITC) algorithm for creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high-dimensional pyramid representation by up to an order of magnitude with little or no loss in accuracy. Furthermore, a comparison with clustering based on the agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational cost. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
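To make the idea of DITC-based compression concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes each bin of the concatenated pyramid histogram is treated as a "word" with a class posterior p(c | w) estimated from training counts, and that bins are grouped by a k-means-like loop that uses KL divergence to the prior-weighted cluster posterior, which is the general form of DITC. The function names, the initialisation scheme, and the toy data are all hypothetical.

```python
import numpy as np


def ditc_cluster(word_class_counts, n_clusters, n_iter=20, eps=1e-12):
    """Group words (rows) by their class posteriors, DITC-style.

    word_class_counts: (n_words, n_classes) co-occurrence counts.
    Returns one cluster index per word.
    """
    counts = np.asarray(word_class_counts, dtype=float) + eps
    prior = counts.sum(axis=1) / counts.sum()           # p(w)
    post = counts / counts.sum(axis=1, keepdims=True)   # p(c | w)
    n_words, n_classes = post.shape

    # Assumed initialisation: order words by dominant class,
    # then cut the ordering into equally sized chunks.
    order = np.argsort(post.argmax(axis=1), kind="stable")
    labels = np.empty(n_words, dtype=int)
    labels[order] = np.arange(n_words) * n_clusters // n_words

    for _ in range(n_iter):
        # Cluster posterior p(c | W_j): prior-weighted mean of member posteriors.
        # Empty clusters fall back to a uniform distribution.
        centers = np.full((n_clusters, n_classes), 1.0 / n_classes)
        for j in range(n_clusters):
            members = labels == j
            if members.any():
                w = prior[members][:, None]
                centers[j] = (w * post[members]).sum(axis=0) / w.sum()

        # KL(p(c | w) || p(c | W_j)) for every word/cluster pair.
        log_ratio = np.log(post[:, None, :]) - np.log(centers[None, :, :])
        kl = (post[:, None, :] * log_ratio).sum(axis=2)

        new_labels = kl.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


def compress_histogram(hist, labels, n_clusters):
    """Merge histogram bins that share a cluster by summing their counts."""
    return np.bincount(labels, weights=hist, minlength=n_clusters)


# Toy data: 4 bins, 2 classes. Bins 0-1 favour class 0, bins 2-3 favour class 1.
toy_counts = np.array([[9, 1], [8, 2], [1, 9], [2, 8]])
toy_labels = ditc_cluster(toy_counts, n_clusters=2)
toy_compact = compress_histogram(np.array([3, 1, 4, 2]), toy_labels, 2)
```

In this toy setting the four bins collapse into two, one per group of bins with similar class posteriors. Applied to a real pyramid, the same mapping would be learned once on training data and then used to shrink every image histogram before classification.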


Information & Contributors

Information

Published In

Pattern Recognition, Volume 45, Issue 4
April 2012
585 pages

Publisher

Elsevier Science Inc.

United States


Author Tags

  1. AIB
  2. Bag of features
  3. DITC
  4. Object and scene recognition
  5. Pyramid representation

Qualifiers

  • Article
