Fractal dimension of bag-of-visual words

Lucas Correia Ribas¹,
Diogo Nunes Gonçalves²,
Jonathan de Andrade Silva²,
Amaury Antônio de Castro Jr.²,
Odemir Martinez Bruno³ &
Wesley Nunes Gonçalves²

Abstract

Scene recognition is an important and challenging problem in computer vision. One of the most widely used scene recognition methods is the bag-of-visual words. Despite its interesting results, this approach does not capture the rich spatial information of the visual words in the image. In this paper, we propose a new method that describes the visual words using the fractal dimension. Our method estimates the fractal dimension of each visual word in the image using the box-counting method. The fractal dimension provides complex spatial information about the visual words in a simple and efficient way. We validate our method on three well-known scene and object datasets, and the experimental results show that it yields highly discriminative features of the visual words. In addition, the proposed method achieves competitive results compared to popular methods in scene classification.
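
The idea of estimating a box-counting dimension per visual word can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the box sizes, the keypoint detector, or the exact feature layout, so the function names (`box_counting_dimension`, `fractal_bovw_descriptor`), the default `box_sizes`, and the choice of one dimension value per word are all assumptions made for illustration. The sketch treats the keypoint locations assigned to a visual word as a 2-D point set, counts the occupied boxes N(r) for each box side r, and takes the least-squares slope of log N(r) against log(1/r) as the dimension estimate.

```python
import numpy as np


def box_counting_dimension(points, box_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension of a 2-D point set.

    points: array of shape (n, 2) holding (x, y) pixel coordinates.
    box_sizes: box side lengths r in pixels (assumed values, not from the paper).
    """
    points = np.asarray(points, dtype=float)
    if points.shape[0] == 0:
        return 0.0  # convention chosen here for an empty point set

    counts = []
    for r in box_sizes:
        # Index of the r-by-r grid cell that contains each point.
        cells = np.floor(points / r).astype(int)
        # N(r): number of distinct occupied cells.
        counts.append(len(np.unique(cells, axis=0)))

    # Least-squares line through (log(1/r), log N(r)); its slope is the estimate.
    log_inv_r = np.log(1.0 / np.asarray(box_sizes, dtype=float))
    slope, _intercept = np.polyfit(log_inv_r, np.log(counts), 1)
    return float(slope)


def fractal_bovw_descriptor(keypoints_xy, word_ids, n_words,
                            box_sizes=(1, 2, 4, 8, 16, 32)):
    """Build one dimension estimate per visual word (hypothetical layout).

    keypoints_xy: array of shape (n, 2) with keypoint locations in one image.
    word_ids: array of shape (n,) with the visual word assigned to each keypoint.
    n_words: vocabulary size.
    """
    keypoints_xy = np.asarray(keypoints_xy, dtype=float)
    word_ids = np.asarray(word_ids)
    descriptor = np.zeros(n_words)
    for w in range(n_words):
        pts = keypoints_xy[word_ids == w]
        if len(pts) > 0:
            descriptor[w] = box_counting_dimension(pts, box_sizes)
    return descriptor
```

With these defaults, a word whose keypoints fill a dense 32 × 32 block gives an estimate close to 2, a word whose keypoints lie on a straight line gives an estimate close to 1, and a word that does not occur in the image is left at 0. A plain histogram of word counts cannot tell the first two cases apart when their counts are equal, and that spatial information is what the fractal dimension is meant to add.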

Acknowledgements

This work was supported by FUNDECT (State of Mato Grosso do Sul Foundation to Support Education, Science and Technology), CAPES (Brazilian Federal Agency for Support and Evaluation of Graduate Education), and CNPq (National Council for Scientific and Technological Development). The Titan X Pascal used for this research was donated by the NVIDIA Corporation. Lucas Correia Ribas gratefully acknowledges financial support from the São Paulo Research Foundation (FAPESP), grant #2016/23763-8. Odemir M. Bruno thanks CNPq (grant #307797/2014-7) and FAPESP (grants #14/08026-1 and #16/18809-9) for financial support.

Author information

Authors and Affiliations

Institute of Mathematics and Computer Science, University of São Paulo, Avenida Trabalhador São Carlense, 400, São Carlos, SP, 13566-590, Brazil
Lucas Correia Ribas
Federal University of Mato Grosso do Sul, Rua Itibiré Vieira, s/n, Residencial Julia Oliveira Cardinal, Ponta Porã, MS, 79907-414, Brazil
Diogo Nunes Gonçalves, Jonathan de Andrade Silva, Amaury Antônio de Castro Jr. & Wesley Nunes Gonçalves
São Carlos Institute of Physics, University of São Paulo, Avenida Trabalhador São Carlense, 400, São Carlos, SP, PO Box 369, 13560-970, Brazil
Odemir Martinez Bruno

Corresponding author

Correspondence to Wesley Nunes Gonçalves.

About this article

Cite this article

Ribas, L.C., Gonçalves, D.N., de Andrade Silva, J. et al. Fractal dimension of bag-of-visual words. Pattern Anal Applic 22, 89–98 (2019). https://doi.org/10.1007/s10044-018-0736-x

Received: 15 June 2017
Accepted: 19 July 2018
Published: 27 July 2018
Issue Date: 05 February 2019
DOI: https://doi.org/10.1007/s10044-018-0736-x
