
A Study on the Cardinality of Ordered Average Pooling in Visual Recognition

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Abstract

Bag-of-Words methods can be robust to image scaling, translation, and occlusion. An important step in this methodology, and in other visual recognition systems such as Convolutional Neural Networks, is spatial pooling, where the descriptors of neighbouring elements are combined into a local or global feature vector. The combined vector must retain relevant information while discarding irrelevant and confusing details. Maximum and average are the most common aggregation functions used in the pooling step. In this work we present a study of the cardinality of ordered average pooling, i.e. the number of ordered elements to aggregate so that the pooled vector keeps the relevant information without losing its discriminative power for classification. We provide an extensive evaluation showing that, for different cardinalities, we can obtain better results than both simple average pooling and maximum pooling when dealing with small dictionary sizes.
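
The abstract does not spell out the pooling operator, so the following is only a minimal sketch under one plausible reading: for each codeword, the coded responses of the pooled region are sorted in descending order and the `k` largest are averaged, where `k` is the cardinality. The function name `ordered_average_pool`, the `(n_descriptors, n_codewords)` layout, and the toy values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def ordered_average_pool(codes: np.ndarray, k: int) -> np.ndarray:
    """Average the k largest responses per codeword (hypothetical helper).

    codes: array of shape (n_descriptors, n_codewords) holding the coded
           descriptors of one pooling region (assumed layout).
    k:     cardinality, i.e. how many ordered elements are aggregated.
    """
    n_descriptors = codes.shape[0]
    if not 1 <= k <= n_descriptors:
        raise ValueError("k must lie between 1 and the number of descriptors")
    # Sort every column in descending order and keep the first k rows.
    top_k = -np.sort(-codes, axis=0)[:k]
    return top_k.mean(axis=0)


# Toy region with 4 descriptors coded over a 2-word dictionary (made-up values).
codes = np.array([
    [0.9, 0.1],
    [0.2, 0.4],
    [0.5, 0.3],
    [0.0, 0.8],
])

pooled_k1 = ordered_average_pool(codes, k=1)  # equals column-wise maximum
pooled_k2 = ordered_average_pool(codes, k=2)  # mean of the two largest values
pooled_k4 = ordered_average_pool(codes, k=4)  # equals column-wise average
```

Under this reading, `k = 1` reduces to maximum pooling and `k` equal to the number of descriptors reduces to average pooling, so the cardinality interpolates between the two common aggregation functions named in the abstract.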

Notes

  1. Maximum expectation is not used with triangle assignment coding because its values are larger than 1.

Author information

Corresponding author

Correspondence to Miguel Pagola.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Pagola, M., Forcen, J.I., Barrenechea, E., Fernández, J., Bustince, H. (2017). A Study on the Cardinality of Ordered Average Pooling in Visual Recognition. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_48

  • DOI: https://doi.org/10.1007/978-3-319-58838-4_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58837-7

  • Online ISBN: 978-3-319-58838-4

  • eBook Packages: Computer Science, Computer Science (R0)
