Scene Classification Using Multi-Resolution WAHOLB Features and Neural Network Classifier

Gholam Ali Montazer¹,² & Davar Giveki²


Abstract

This article approaches the scene classification problem by proposing an enhanced bag of features (BoF) model and a modified radial basis function neural network (RBFNN) classifier. The proposed BoF model integrates image features extracted by the histogram of oriented gradients, the local binary pattern, and wavelet coefficients. The features are extracted in a hierarchical, multi-resolution manner, which allows the approach to capture multi-level (pixel-, patch-, and image-level) features. The feature histograms constructed by the BoF model are then used to train the modified RBFNN classifier. The modification consists of a new variant of particle swarm optimization, in which the parameters are updated adaptively, for determining the centers of the Gaussian functions in the RBFNN. Experimental results demonstrate that the proposed approach significantly outperforms state-of-the-art methods for scene classification on the OT, FP, and LSP benchmark datasets.
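The histogram construction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook, the toy two-dimensional descriptors, and the function names `nearest_codeword`, `bof_histogram`, and `multi_resolution_histogram` are all assumptions made for illustration, and plain nearest-codeword assignment with L1 normalization stands in for whatever quantization the paper actually uses.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_codeword(descriptor: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codeword closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(descriptor, codebook[k]))


def bof_histogram(descriptors: List[Vector], codebook: List[Vector]) -> List[float]:
    """Count nearest-codeword assignments and L1-normalize the counts."""
    counts = [0.0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1.0
    total = sum(counts)
    # An empty descriptor set yields an all-zero histogram.
    return [c / total for c in counts] if total else counts


def multi_resolution_histogram(levels: List[List[Vector]], codebook: List[Vector]) -> List[float]:
    """Concatenate one BoF histogram per resolution level (hypothetical layout)."""
    features: List[float] = []
    for descriptors in levels:
        features.extend(bof_histogram(descriptors, codebook))
    return features


# Toy data (assumed): a two-word codebook and descriptors from two resolution levels.
codebook = [(0.0, 0.0), (1.0, 1.0)]
fine_level = [(0.1, 0.0), (0.9, 1.0), (1.2, 0.8), (0.0, 0.2)]
coarse_level = [(1.0, 1.0)]

feature_vector = multi_resolution_histogram([fine_level, coarse_level], codebook)
print(feature_vector)
```

In a pipeline of the kind the abstract describes, a concatenated vector like `feature_vector` would serve as the input to the classifier; here the descriptors would come from the combined gradient, binary-pattern, and wavelet features rather than toy points.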


Acknowledgements

The authors are grateful to the anonymous reviewers for their insightful comments and constructive suggestions. Part of this research has been funded by the Iranian Research Institute for Information Science and Technology (IranDoc) (No. TMU92-03-44).

Author information

Authors and Affiliations

Information Technology Engineering Department, School of Engineering, Tarbiat Modares University, P.O. Box 14115-179, Tehran, Iran
Gholam Ali Montazer
Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran
Gholam Ali Montazer & Davar Giveki

Corresponding author

Correspondence to Gholam Ali Montazer.

About this article

Cite this article

Montazer, G.A., Giveki, D. Scene Classification Using Multi-Resolution WAHOLB Features and Neural Network Classifier. Neural Process Lett 46, 681–704 (2017). https://doi.org/10.1007/s11063-017-9614-6

Published: 03 April 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s11063-017-9614-6
