Discriminative compact pyramids for object and scene recognition

Authors:

Noha M. Elfiky,

Fahad Shahbaz Khan,

Joost Van De Weijer,

Jordi Gonzàlez

Pattern Recognition, Volume 45, Issue 4

Pages 1627 - 1636

https://doi.org/10.1016/j.patcog.2011.09.020

Published: 01 April 2012

Abstract

Spatial pyramids have been successfully applied to incorporate spatial information into bag-of-words based image representations. However, a major drawback is that they lead to high-dimensional image representations. In this paper, we present a novel framework for obtaining a compact pyramid representation. First, we investigate the use of the divisive information-theoretic feature clustering (DITC) algorithm to create a compact pyramid representation. In many cases this method allows us to reduce the size of a high-dimensional pyramid representation by up to an order of magnitude with little or no loss in accuracy. Furthermore, a comparison with clustering based on the agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational cost. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.
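The compression idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a simplified DITC-style loop that groups histogram bins whose class distributions p(class | bin) are similar, measured by KL divergence, and then sums the grouped bins. The function names (`kl_rows`, `ditc_like_clustering`, `compress_histogram`), the round-robin initialisation, and the fixed iteration budget are all assumptions made for illustration.

```python
import numpy as np

def kl_rows(p, q, eps=1e-12):
    """Return KL(p_i || q_j) for every row p_i of p and every row q_j of q."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    # sum_c p_ic * (log p_ic - log q_jc), shape (n_bins, n_clusters)
    return (p * np.log(p)).sum(axis=1)[:, None] - p @ np.log(q).T

def ditc_like_clustering(counts, n_clusters, n_iter=20):
    """Group bins by class distribution.

    counts[w, c] is how often bin w occurs in training images of class c.
    Returns one cluster index per bin.
    """
    n_bins, n_classes = counts.shape
    prior = counts.sum(axis=1) / counts.sum()  # pi_w
    p_c_given_w = counts / np.clip(counts.sum(axis=1, keepdims=True), 1e-12, None)
    # Assumption: simple round-robin start, chosen only to keep the sketch deterministic.
    assign = np.arange(n_bins) % n_clusters
    for _ in range(n_iter):
        # Cluster distribution = prior-weighted mean of its members' p(class | bin).
        centers = np.zeros((n_clusters, n_classes))
        np.add.at(centers, assign, prior[:, None] * p_c_given_w)
        mass = centers.sum(axis=1, keepdims=True)
        empty = mass[:, 0] <= 0
        centers = centers / np.clip(mass, 1e-12, None)
        centers[empty] = 1.0 / n_classes  # uniform placeholder for empty clusters
        # Move every bin to the cluster with the closest distribution.
        new_assign = kl_rows(p_c_given_w, centers).argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
    return assign

def compress_histogram(hist, assign, n_clusters):
    """Sum the bins of one image histogram that fall in the same cluster."""
    return np.bincount(assign, weights=hist, minlength=n_clusters)
```

In this sketch a concatenated pyramid histogram with many bins would be passed through `compress_histogram` once the bin-to-cluster assignment has been learned from labelled training counts, so the classifier sees `n_clusters` dimensions instead of the full pyramid length.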


Published In

Pattern Recognition, Volume 45, Issue 4

April 2012

585 pages

ISSN: 0031-3203

Copyright © 2011 Elsevier Ltd.

Publisher

Elsevier Science Inc.

United States
