Abstract
This article approaches the scene classification problem by proposing an enhanced bag of features (BoF) model and a modified radial basis function neural network (RBFNN) classifier. The proposed BoF model integrates image features extracted by the histogram of oriented gradients, local binary patterns, and wavelet coefficients. The features are extracted in a hierarchical, multi-resolution manner. The approach is able to capture multi-level (pixel-, patch-, and image-level) features. The feature histograms constructed by the BoF model are then used to train the modified RBFNN classifier. As the modification, we propose using a new variant of particle swarm optimization, in which the parameters are updated adaptively, to determine the centers of the Gaussian functions in the RBFNN. Experimental results demonstrate that the proposed approach significantly outperforms state-of-the-art methods on scene classification for the OT, FP, and LSP benchmark datasets.
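As a rough illustration of the two stages named in the abstract, the following minimal sketch builds a BoF histogram by assigning local descriptors to their nearest codeword and then evaluates a Gaussian RBF network on that histogram. It is not the authors' implementation: the function names `bof_histogram` and `rbf_forward`, the toy codebook, and the hand-picked centers, width, and weights are all assumptions made for illustration. Feature extraction (histogram of oriented gradients, local binary patterns, wavelet coefficients) and the particle swarm optimization that would select the centers are omitted.

```python
import math


def bof_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def rbf_forward(x, centers, width, weights):
    """Single-output Gaussian RBF network: weighted sum of Gaussian activations."""
    activations = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * width ** 2))
        for c in centers
    ]
    return sum(w * a for w, a in zip(weights, activations))


# Toy data (hypothetical): two codewords and four 2-D local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.2, 0.1]]
hist = bof_histogram(descriptors, codebook)

# Centers are fixed by hand here; in the article they are chosen by a PSO variant.
score = rbf_forward(hist, centers=[[0.5, 0.5], [1.0, 0.0]], width=0.5, weights=[1.0, -1.0])
```

In a full pipeline, one such histogram per image would be the classifier input, and the output weights would be learned from training data rather than set manually.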
Acknowledgements
The authors are grateful to the anonymous reviewers for their insightful comments and constructive suggestions. Part of this research has been funded by the Iranian Research Institute for Information Science and Technology (IranDoc) (No. TMU92-03-44).
Cite this article
Montazer, G.A., Giveki, D. Scene Classification Using Multi-Resolution WAHOLB Features and Neural Network Classifier. Neural Process Lett 46, 681–704 (2017). https://doi.org/10.1007/s11063-017-9614-6
DOI: https://doi.org/10.1007/s11063-017-9614-6