Abstract
Convolutional neural networks (CNNs) are highly effective for image classification and other computer vision tasks. The accuracy of a CNN depends on its architectural design and the selection of optimal parameters. The number of parameters grows steeply with every layer added to a deep CNN architecture; consequently, the manual selection of efficient parameters remains largely ad hoc. Solving this problem requires a careful examination of the relationship between the depth of the architecture, the input parameters, and the model's accuracy. Evolutionary algorithms are prominent in addressing the challenges of architecture design and parameter selection. However, adopting evolutionary algorithms is itself challenging, as the computational cost increases over the course of the evolution. Moreover, the performance of an evolutionary algorithm depends on the encoding technique used to represent a CNN architecture. In this article, we present a comprehensive study of recent approaches to the design and training of CNN architectures. We discuss the advantages and disadvantages of selecting a CNN architecture using evolutionary algorithms, and we compare manually designed architectures with automatically designed ones in terms of accuracy and the range of parameters on existing benchmark datasets. Furthermore, we discuss the ongoing issues and challenges in evolutionary-algorithm-based CNN architecture design.
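To make the notion of an encoding technique concrete, the sketch below shows a minimal variable-length, list-based genome for a CNN together with a simple mutation operator, in Python. The layer types, parameter ranges, and mutation scheme are illustrative assumptions, not the encoding of any particular surveyed method; fitness evaluation (decoding the genome into a trainable network and measuring its accuracy) is omitted.

```python
import random

# Illustrative assumption: candidate values for conv-layer genes.
CONV_FILTERS = [16, 32, 64, 128]
KERNEL_SIZES = [3, 5, 7]

def random_layer():
    """Sample one gene: a convolutional block or a pooling layer."""
    if random.random() < 0.7:
        return {"type": "conv",
                "filters": random.choice(CONV_FILTERS),
                "kernel": random.choice(KERNEL_SIZES)}
    return {"type": "pool", "kind": random.choice(["max", "avg"])}

def random_genome(min_len=3, max_len=10):
    """A genome is a variable-length list of layer genes, so
    architectures of different depths can coexist in one population."""
    return [random_layer() for _ in range(random.randint(min_len, max_len))]

def mutate(genome, p=0.2):
    """Point mutation: with probability p, replace, insert, or delete
    one layer gene, changing either the depth or a hyperparameter."""
    genome = list(genome)
    if random.random() < p:
        op = random.choice(["replace", "insert", "delete"])
        i = random.randrange(len(genome))
        if op == "replace":
            genome[i] = random_layer()
        elif op == "insert":
            genome.insert(i, random_layer())
        elif len(genome) > 1:
            del genome[i]
    return genome

if __name__ == "__main__":
    parent = random_genome()
    child = mutate(parent, p=1.0)
    print("parent:", parent)
    print("child :", child)
```

The variable-length list is one of the simpler direct encodings; block-based and graph-based encodings trade this simplicity for richer search spaces, which is part of the design tension discussed in the survey.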
Author information
Contributions
All authors take public responsibility for the content of the work submitted for review. The authors confirm their contributions to the paper as follows: (1) VM: conception and design of the work; (2) VM: methodology; (3) VM: writing, original draft; (4) VM: critical revision of the article; (5) LK: supervision; (6) LK and VM: final approval of the version to be communicated.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mishra, V., Kane, L. A survey of designing convolutional neural network using evolutionary algorithms. Artif Intell Rev 56, 5095–5132 (2023). https://doi.org/10.1007/s10462-022-10303-4