Abstract
Early-stage disease indications are rarely recorded in real-world domains such as agriculture and healthcare, and yet their accurate identification is critical at that point in time. In this type of highly imbalanced classification problem, which involves complex features, deep learning (DL) is much needed because of its strong detection capabilities. At the same time, DL is observed in practice to favor the majority classes over the minority ones and, consequently, to suffer from inaccurate detection of the targeted early-stage indications. In this work, we extend the study by [11], showing that the final batch normalization (BN) layer, when placed before the softmax output layer, has a considerable impact on highly imbalanced image classification problems and undermines the role of the softmax outputs as an uncertainty measure. The current study addresses additional hypotheses and reports the following findings: (i) the performance gain obtained by adding the final BN layer in highly imbalanced settings can still be achieved after removing this additional BN layer at inference; (ii) there is a certain threshold for the imbalance ratio at which the gain from the final BN layer peaks; (iii) the batch size also plays a role and affects the outcome of the final BN application; (iv) the impact of the BN application is also reproducible on other datasets and with much simpler neural architectures; (v) the reported BN effect occurs only with a single majority class and multiple minority classes; no improvement is evident when there are two majority classes; and finally, (vi) utilizing this BN layer with sigmoid activation has almost no impact on strongly imbalanced image classification tasks.
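For concreteness, the sketch below is our illustration of the studied modification, not code from the paper; the `resnet18` backbone, layer sizes, and two-class setup are assumptions. It shows where the final BN layer sits: after the last fully connected layer and before the softmax, which is here deferred to the cross-entropy loss.

```python
# Minimal sketch (PyTorch, assumed setup): a BatchNorm1d layer inserted between
# the last fully connected layer and the softmax output. The resnet18 backbone
# and the class count are illustrative choices, not the paper's exact setup.
import torch
import torch.nn as nn
from torchvision import models


class FinalBNClassifier(nn.Module):
    def __init__(self, num_classes: int = 2, use_final_bn: bool = True):
        super().__init__()
        backbone = models.resnet18(weights=None)  # any feature extractor works
        in_features = backbone.fc.in_features
        backbone.fc = nn.Identity()               # strip the stock head
        self.backbone = backbone
        self.fc = nn.Linear(in_features, num_classes)
        # The extra BN layer under study: it normalizes the logits themselves.
        self.final_bn = nn.BatchNorm1d(num_classes) if use_final_bn else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.fc(self.backbone(x))
        return self.final_bn(logits)  # softmax is applied inside the loss


model = FinalBNClassifier(num_classes=2)
criterion = nn.CrossEntropyLoss()  # log-softmax + NLL, so the model emits raw logits
```

Finding (i) can then be probed by replacing `model.final_bn` with `nn.Identity()` after training and before evaluation, so that the normalization is active during optimization but dropped at inference.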
References
Barz, B., Denzler, J.: Deep learning on small datasets without pre-training using cosine loss. In: The IEEE Winter Conference on Applications of Computer Vision, pp. 1371–1380 (2020)
Beggel, L., Pfeiffer, M., Bischl, B.: Robust anomaly detection in images using adversarial autoencoders. arXiv preprint arXiv:1901.06355 (2019)
Bjorck, N., Gomes, C.P., Selman, B., Weinberger, K.Q.: Understanding batch normalization. In: Advances in Neural Information Processing Systems, pp. 7694–7705 (2018)
Chelombiev, I., Houghton, C., O’Donnell, C.: Adaptive estimators show information compression in deep neural networks. arXiv preprint arXiv:1902.09037 (2019)
Cui, Y., Jia, M., Lin, T.Y., Song, Y., Belongie, S.: Class-balanced loss based on effective number of samples. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9268–9277 (2019)
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q.: On calibration of modern neural networks. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1321–1330. JMLR.org (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hussain, M., Bird, J.J., Faria, D.R.: A study on CNN transfer learning for image classification. In: Lotfi, A., Bouchachia, H., Gegov, A., Langensiepen, C., McGinnity, M. (eds.) UKCI 2018. AISC, vol. 840, pp. 191–202. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-97982-3_16
Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Kocaman, V., Shir, O.M., Bäck, T.: Improving model accuracy for imbalanced image classification tasks by adding a final batch normalization layer: an empirical study. arXiv preprint arXiv:2011.06319 (2020). Accepted to the International Conference on Pattern Recognition (ICPR 2020)
Krogh, A., Hertz, J.A.: A simple weight decay can improve generalization. In: Advances in Neural Information Processing Systems, pp. 950–957 (1992)
Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)
Mishkin, D., Matas, J.: All you need is a good init. arXiv preprint arXiv:1511.06422 (2015)
Mohanty, S.P., Hughes, D.P., Salathé, M.: Using deep learning for image-based plant disease detection. Front. Plant Sci. 7, 1419 (2016)
Müller, R., Kornblith, S., Hinton, G.E.: When does label smoothing help? In: Advances in Neural Information Processing Systems, pp. 4696–4705 (2019)
Santurkar, S., Tsipras, D., Ilyas, A., Madry, A.: How does batch normalization help optimization? In: Advances in Neural Information Processing Systems, pp. 2483–2493 (2018)
Shorten, C., Khoshgoftaar, T.M.: A survey on image data augmentation for deep learning. J. Big Data 6(1), 60 (2019)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Smith, L.N.: Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 464–472. IEEE (2017)
Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kocaman, V., Shir, O.M., Bäck, T. (2021). The Unreasonable Effectiveness of the Final Batch Normalization Layer. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2021. Lecture Notes in Computer Science, vol. 13018. Springer, Cham. https://doi.org/10.1007/978-3-030-90436-4_7
DOI: https://doi.org/10.1007/978-3-030-90436-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90435-7
Online ISBN: 978-3-030-90436-4
eBook Packages: Computer Science (R0)