
E-Tanh: a novel activation function for image processing neural network models

  • Original Article
Neural Computing and Applications

Abstract

Artificial neural networks (ANNs) are among the technologies applied to emerging real-world problems. Activation functions (AFs) are used in the hidden and output layers of deep learning architectures to make decisions, and the choice of AF influences both the training dynamics and the performance of an ANN. This paper proposes a novel AF, E-Tanh, which extends the Tanh AF for ANN models. In three shallow ANN models with 2, 4, and 6 hidden layers trained on the Modified National Institute of Standards and Technology (MNIST) database, E-Tanh reached a maximum accuracy of 99.5% for handwritten digit classification, higher than the accuracies obtained with the existing, well-known AFs ReLU, Sigmoid, Tanh, Swish, and E-Swish. In deep ANNs with 5–50 hidden layers, E-Tanh achieved higher prediction accuracy than the existing AFs on the MNIST database for networks of up to 25 hidden layers. In a shallow CNN on the CIFAR10 database, it also outperformed the other AFs. In a wide residual network, it performed poorly in classifying images from the CIFAR10 and CIFAR100 databases, but it performed competitively with all other AFs in CNN models on the MNIST dataset.
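
For orientation, the baseline AFs named in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch using the standard textbook definitions, not the authors' code. The E-Tanh formula is not stated in the abstract, so it is deliberately left out. The helper name `evaluate` and the default `beta=1.5` for E-Swish are assumptions made for illustration.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent, the function that E-Tanh extends."""
    return math.tanh(x)


def swish(x: float) -> float:
    """Swish: x * sigmoid(x)."""
    return x * sigmoid(x)


def e_swish(x: float, beta: float = 1.5) -> float:
    """E-Swish: beta * x * sigmoid(x). The default beta is an assumed value."""
    return beta * x * sigmoid(x)


# Baseline AFs listed in the abstract (E-Tanh is not included here because
# its definition does not appear in this excerpt).
BASELINES = {
    "ReLU": relu,
    "Sigmoid": sigmoid,
    "Tanh": tanh,
    "Swish": swish,
    "E-Swish": e_swish,
}


def evaluate(xs):
    """Hypothetical helper: apply every baseline AF to each input in xs."""
    return {name: [f(x) for x in xs] for name, f in BASELINES.items()}


if __name__ == "__main__":
    for name, values in evaluate([-2.0, 0.0, 2.0]).items():
        print(name, [round(v, 4) for v in values])
```

Keeping the functions in a dictionary makes it easy to swap one AF for another in an experiment, which mirrors the kind of side-by-side comparison the abstract describes.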

Author information

Corresponding author

Correspondence to T. Kalaiselvi.

Ethics declarations

Conflicts of interest

The authors of this article certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kalaiselvi, T., Padmapriya, S.T., Somasundaram, K. et al. E-Tanh: a novel activation function for image processing neural network models. Neural Comput & Applic 34, 16563–16575 (2022). https://doi.org/10.1007/s00521-022-07245-x

  • DOI: https://doi.org/10.1007/s00521-022-07245-x
