Abstract
Artificial neural networks (ANNs) are among the technologies used to address emerging real-world problems. Activation functions (AFs) are used in deep learning architectures to make decisions in the hidden and output layers, and the choice of AF influences both the training dynamics and the performance of an ANN. This paper proposes a novel AF, called E-Tanh, obtained by extending the Tanh AF for ANN models. When used in three shallow ANN models with 2, 4, and 6 hidden layers on the Modified National Institute of Standards and Technology (MNIST) database, it achieved a maximum accuracy of 99.5% for handwritten digit classification, higher than the results produced by the existing AFs. The proposed E-Tanh outperformed the existing, well-known AFs ReLU, Sigmoid, Tanh, Swish, and E-Swish. In deep ANNs with 5–50 hidden layers, the proposed AF's prediction accuracy was higher than that of the existing AFs for up to 25 hidden layers on the MNIST database. When used in a shallow CNN on the CIFAR10 database, it outperformed the other AFs. When used in a wide residual network, it performed poorly in classifying images in the CIFAR10 and CIFAR100 databases, but it performed competitively with all other AFs in CNN models on the MNIST dataset.
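The abstract does not give the closed form of E-Tanh, so it is not reconstructed here. For reference, the baseline activations it is compared against have well-known definitions and can be sketched as follows; the E-Swish scaling factor `beta = 1.375` is one of the values suggested in the E-Swish paper and is an assumption in this sketch:

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """ReLU: max(0, x)."""
    return np.maximum(0.0, x)

def tanh(x):
    """Hyperbolic tangent."""
    return np.tanh(x)

def swish(x):
    """Swish (Ramachandran et al., 2017): x * sigmoid(x)."""
    return x * sigmoid(x)

def e_swish(x, beta=1.375):
    """E-Swish (Alcaide, 2018): beta * x * sigmoid(x), with beta > 1.

    beta = 1.375 is an assumed choice for illustration.
    """
    return beta * x * sigmoid(x)
```

All five functions operate elementwise on NumPy arrays, so they can be dropped into a simple forward pass as hidden-layer nonlinearities.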
Ethics declarations
Conflicts of interest
The authors of this article certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Kalaiselvi, T., Padmapriya, S.T., Somasundaram, K. et al. E-Tanh: a novel activation function for image processing neural network models. Neural Comput & Applic 34, 16563–16575 (2022). https://doi.org/10.1007/s00521-022-07245-x