Evolutionary Deep Attention Convolutional Neural Networks for 2D and 3D Medical Image Segmentation

  • Original Paper
  • Journal of Digital Imaging

Abstract

Developing a convolutional neural network (CNN) for medical image segmentation is a complex task, especially given the limited number of labelled medical images and the computational resources available. The task is harder still when the aim is to develop a deep network with complicated structures such as attention blocks. Yet, because of the various types of noise, artefacts, and diversity in medical images, using complicated structures such as attention mechanisms is unavoidable if segmentation accuracy is to be improved. It is therefore necessary to develop techniques that address these difficulties. Neuroevolution combines evolutionary computation and neural networks to establish a network automatically; however, it is computationally expensive, particularly for creating 3D networks. In this paper, an automatic, efficient, accurate, and robust technique is introduced that uses Neuroevolution to develop deep attention convolutional neural networks for both 2D and 3D medical image segmentation. The proposed evolutionary technique finds effective combinations of six attention modules that recover spatial information from the downsampling section of a U-Net-based network and transfer it to the upsampling section. Six CT and MRI datasets are employed to evaluate the proposed model for both 2D and 3D image segmentation. The obtained results are compared with state-of-the-art manually designed and automatic models, and the proposed model outperforms all of them.

Data Availability

Datasets are publicly available

Code Availability

Not applicable

References

  1. Abbas Q, Ibrahim ME, Jaffar MA: A comprehensive review of recent advances on deep vision systems. Artif Intell Rev 52(1):39–76, 2019

  2. Back T: Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms. Oxford university press, 1996

  3. Bahdanau D, Cho K, Bengio Y: Neural machine translation by jointly learning to align and translate. arXiv preprint, 2014. arXiv:14090473

  4. Baldeon-Calisto M, Lai-Yuen SK: Adaresu-net: Multiobjective adaptive convolutional neural network for medical image segmentation. Neurocomputing 392:325–340, 2020

  5. Calisto MB, Lai-Yuen SK: Adaen-net: An ensemble of adaptive 2d-3d fully convolutional networks for medical image segmentation. Neural Netw 2020

  6. Chen P, Sun Z, Bing L, Yang W: Recurrent attention network on memory for aspect sentiment analysis. In: Proceedings of the 2017 conference on empirical methods in natural language processing, 2017, pp 452–461

  7. Cheung B, Sable C: Hybrid evolution of convolutional networks. In 2011 10th International Conference on Machine Learning and Applications and Workshops, vol. 1, IEEE, 2011, pp. 293–297

  8. Chollet F, et al: Keras. https://keras.io, 2015

  9. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O: 3d u-net: learning dense volumetric segmentation from sparse annotation. In International conference on medical image computing and computer-assisted intervention, Springer, 2016, pp. 424–432

  10. Cireşan D, Meier U, Masci J, Schmidhuber J: Multi-column deep neural network for traffic sign classification. Neural Netw 32:333–338, 2012

  11. Darwish A, Hassanien AE, Das S: A survey of swarm and evolutionary computing approaches for deep learning. Artif Intell Rev 53, 3:1767–1812, 2020

  12. Dice LR: Measures of the amount of ecologic association between species. Ecology 26, 3:297–302, 1945

  13. Dong N, Xu M, Liang X, Jiang Y, Dai W, Xing E: Neural architecture search for adversarial medical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2019, pp. 828–836

  14. Drozdzal M, Vorontsov E, Chartrand G, Kadoury S, Pal C: The importance of skip connections in biomedical image segmentation. In Deep Learning and Data Labeling for Medical Applications. Springer, 2016, pp. 179–187

  15. Fogel DB: Phenotypes, genotypes, and operators in evolutionary computation. In Proc. 1995 IEEE Int. Conf. Evolutionary Computation (ICEC 95), 1995, pp. 193–198

  16. Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H: Dual attention network for scene segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 3146–3154

  17. Fujino S, Mori N, Matsumoto K: Deep convolutional networks for human sketches by means of the evolutionary deep learning. In 2017 Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems (IFSA-SCIS), IEEE, 2017, pp. 1–5

  18. Goldberg DE, Deb K: A comparative analysis of selection schemes used in genetic algorithms. In Foundations of genetic algorithms, vol. 1. Elsevier, 1991, pp. 69–93

  19. Guo Y, Liu Y, Oerlemans A, Lao S, Wu S, Lew MS: Deep learning for visual understanding: A review. Neurocomputing 187:27–48, 2016

  20. Hassanzadeh T, Essam D, Sarker R: 2d to 3d evolutionary deep convolutional neural networks for medical image segmentation. IEEE Trans Med Imaging, 2020

  21. Hassanzadeh T, Essam D, Sarker R: Evolutionary attention network for medical image segmentation. In 2020 The International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2020, pp. 1–8

  22. Hassanzadeh T, Essam D, Sarker R: An evolutionary denseres deep convolutional neural network for medical image segmentation. IEEE Access, 2020

  23. Hassanzadeh T, Essam D, Sarker R: Evou-net: an evolutionary deep fully convolutional neural network for medical image segmentation. In Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020, pp. 181–189

  24. He K, Zhang X, Ren S, Sun J: Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition 2016, pp. 770–778

  25. Heimann T, Van Ginneken B, Styner MA, Arzhaeva Y, Aurich V, Bauer C, Beck A, Becker C, Beichel R, Bekes G, et al: Comparison and evaluation of methods for liver segmentation from ct datasets. IEEE Trans Med Imaging 28, 8:1251–1265, 2009

  26. Hochreiter S, Schmidhuber J: Long short-term memory. Neural Comput 9, 8:1735–1780, 1997

  27. Holland JH, et al: Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press, 1992

  28. Hu J, Shen L, Sun G: Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 7132–7141

  29. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ: Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition 2017, pp. 4700–4708

  30. Khan A, Sohail A, Zahoora U, Qureshi AS: A survey of the recent architectures of deep convolutional neural networks. Artif Intell Rev 53, 8:5455–5516, 2020

  31. Kolařík M, Burget R, Uher V, Říha K, Dutta MK: Optimized high resolution 3d dense-u-net network for brain and spine segmentation. Appl Sci 9, 3:404, 2019

  32. Krizhevsky A, Sutskever I, Hinton GE: Imagenet classification with deep convolutional neural networks. Commun ACM 60, 6:84–90, 2017

  33. LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD: Backpropagation applied to handwritten zip code recognition. Neural Comput 1, 4:541–551, 1989

  34. Li H, Xiong P, An J, Wang L: Pyramid attention network for semantic segmentation. arXiv preprint, 2018. arXiv:1805.10180

  35. Li Y, Hao Z, Lei H: Survey of convolutional neural network. J Comput Appl 36, 9:2508–2515, 2016

  36. Li Y, Zhu Z, Kong D, Han H, Zhao Y: Ea-lstm: Evolutionary attention-based lstm for time series prediction. Knowledge-Based Systems 181:104785, 2019

  37. Liu X, Deng Z, Yang Y: Recent progress in semantic image segmentation. Artif Intell Rev 52, 2:1089–1106, 2019

  38. Mane D, Kulkarni UV: A survey on supervised convolutional neural network and its major applications. In Deep Learning and Neural Networks: Concepts, Methodologies, Tools, and Applications. IGI Global, 2020, pp. 1058–1071

  39. Mortazi A, Bagci U: Automatically designing cnn architectures for medical image segmentation. In International Workshop on Machine Learning in Medical Imaging, Springer, 2018, pp. 98–106

  40. Qin Z, Yu F, Liu C, Chen X: How convolutional neural network see the world-a survey of convolutional neural network visualization methods. arXiv preprint, 2018. arXiv:1804.11191

  41. Real E, Aggarwal A, Huang Y, Le QV: Regularized evolution for image classifier architecture search. In Proceedings of the aaai conference on artificial intelligence, 2019, vol. 33, pp. 4780–4789

  42. Real E, Moore S, Selle A, Saxena S, Suematsu YL, Tan J, Le Q, Kurakin A: Large-scale evolution of image classifiers. arXiv preprint, 2017. arXiv:1703.01041

  43. Ronneberger O, Fischer P, Brox T: U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, Springer, 2015, pp. 234–241

  44. Schlemper J, Oktay O, Schaap M, Heinrich M, Kainz B, Glocker B, Rueckert D: Attention gated networks: Learning to leverage salient regions in medical images. Med Image Anal 53, 197–207, 2019

  45. Shen T, Zhou T, Long G, Jiang J, Pan S, Zhang C: Disan: Directional self-attention network for rnn/cnn-free language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018, vol. 32

  46. Simpson AL, Antonelli M, Bakas S, Bilello M, Farahani K, Van Ginneken B, Kopp-Schneider A, Landman BA, Litjens G, Menze B, et al: A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv preprint, 2019. arXiv:1902.09063

  47. Stanley KO, Miikkulainen R: Evolving neural networks through augmenting topologies. Evol Comput 10, 2:99–127, 2002

  48. Tian Y, Zhang Y, Zhou D, Cheng G, Chen WG, Wang R: Triple attention network for video segmentation. Neurocomputing 417:202–211, 2020

  49. Wang F, Jiang M, Qian C, Yang S, Li C, Zhang H, Wang X, Tang X: Residual attention network for image classification. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 3156–3164

  50. Weng Y, Zhou T, Li Y, Qiu X: Nas-unet: Neural architecture search for medical image segmentation. IEEE Access 7:44247–44257, 2019

  51. Yu F, Koltun V: Multi-scale context aggregation by dilated convolutions. arXiv preprint, 2015. arXiv:1511.07122

  52. Yu L, Yang X, Chen H, Qin J, Heng PA: Volumetric convnets with mixed residual connections for automated prostate segmentation from 3d mr images. In Thirty-first AAAI conference on artificial intelligence, 2017

  53. Zhang H, Jin Y, Cheng R, Hao K: Efficient evolutionary search of attention convolutional networks via sampled training and node inheritance. IEEE Trans Evol Comput, 2020

  54. Zoph B, Le QV: Neural architecture search with reinforcement learning. arXiv preprint, 2016. arXiv:1611.01578

  55. Zoph B, Vasudevan V, Shlens J, Le QV: Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 8697–8710

Funding

Not applicable

Author information


Corresponding author

Correspondence to Tahereh Hassanzadeh.

Ethics declarations

Ethics Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

Not applicable

Conflict of Interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix: Extra Evaluation

2D Versus 3D

In this section, the proposed evolutionary 2D attention model is compared with its proposed evolutionary 3D counterpart.

Time Comparison

In this work, one NVIDIA GPU was used to train the 2D model and two NVIDIA GPUs were used to train the proposed 3D model. Figure 5 illustrates the training time required by the 2D and 3D models. For example, the proposed model needs about 24 days of training to find a set of 3D networks for 3D heart segmentation, whereas the 2D model needs less than four days. Figure 5 shows a considerable difference between the 2D and 3D models in terms of required computation; nevertheless, the 3D evolutionary model remains feasible to train with a limited number of GPUs.

Fig. 5 Comparison of the 2D and 3D models regarding the required training time

Parameter Comparison

In this section, the best 2D attention network found, its corresponding converted 3D network, and the evolutionary 3D attention network are compared in terms of the number of trainable parameters. As shown in Fig. 6, the obtained 2D networks use less than a million parameters; however, converting the 2D operations to 3D increases the parameter count by roughly five to six times. The final evolutionary 3D attention networks used between two and nine million parameters, which is relatively small for a 3D image segmentation network.

Fig. 6 Comparison of the 2D and 3D models regarding the number of trainable parameters
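For reference, trainable-parameter counts like those in Fig. 6 can be read directly off a compiled Keras model (the framework used in this work [8]). The helper below is an illustrative sketch, not the authors' code:

```python
import tensorflow as tf

def trainable_params(model: tf.keras.Model) -> int:
    """Sum the element counts of all trainable weight tensors."""
    return sum(w.shape.num_elements() for w in model.trainable_weights)

# Example (hypothetical models): compare a 2D network with its 3D conversion.
# print(trainable_params(model_2d), trainable_params(model_3d))
```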

Structure Comparison

Table 9 provides the genotypes of the best networks found for each dataset in 2D and 3D. As the table shows, each network has its own structural and training parameters, which demonstrates both the diversity of the networks found and the effect of the input data on the final network. Following the paper's approach, each of these genotypes can be converted to its corresponding network, or phenotype. Note that each network was evolved with its own training parameters. For example, the best 2D network found for the Sliver dataset is trained using the RMSprop optimiser with a learning rate of 0.001, a batch size of 8, an augmentation size of 32,000, and "he_uniform" weight initialisation.

Table 9 The chromosomes of the best-found 2D and 3D networks for the six datasets
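To make the genotype idea concrete, here is a minimal sketch of how such a chromosome might be represented and decoded into Keras training settings. The Genotype class, its field names, and the value choices are hypothetical, chosen only to mirror the Sliver example above; the paper's exact encoding is given in Table 9.

```python
from dataclasses import dataclass, field

import tensorflow as tf

@dataclass
class Genotype:
    """One chromosome: structural genes plus training hyperparameters.

    Field names are illustrative, not the paper's exact encoding (Table 9).
    """
    attention_modules: list = field(default_factory=list)  # module index (1-6) per skip connection
    optimiser: str = "rmsprop"
    learning_rate: float = 0.001
    batch_size: int = 8
    augmentation_size: int = 32000
    weight_init: str = "he_uniform"

def make_optimiser(g: Genotype) -> tf.keras.optimizers.Optimizer:
    """Decode the training genes into a Keras optimiser instance."""
    opts = {
        "rmsprop": tf.keras.optimizers.RMSprop,
        "adam": tf.keras.optimizers.Adam,
        "sgd": tf.keras.optimizers.SGD,
    }
    return opts[g.optimiser](learning_rate=g.learning_rate)

# The Sliver example from the text decodes roughly to the defaults above:
sliver_2d = Genotype(attention_modules=[4, 6, 1, 5])  # structural genes hypothetical
```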

Attention Modules

This section compares the attention modules used by the five best 2D and 3D networks.

Figure 7 shows the distribution of the attention modules used in the five best 2D networks for each dataset. For some datasets, such as Sliver, the five best networks used all six modules; for others, such as Prostate, only five of the module types appear across the five best networks. A similar pattern can be seen in the 3D networks (see Fig. 8); for example, the best 3D attention networks for the Liver dataset used only the residual activation unit and squeeze-and-excitation modules.

Based on the input data and the DSCs obtained during evolution, the best combination of attention modules was selected for each network, a selection that would be very difficult to make if the network were designed manually.

Fig. 7 Distribution of the attention modules used in the five best 2D networks. 1: concatenation, 2: dense, 3: residual, 4: residual activation unit, 5: attention residual, 6: squeeze and excitation

Fig. 8 Distribution of the attention modules used in the five best 3D networks. 1: concatenation, 2: dense, 3: residual, 4: residual activation unit, 5: attention residual, 6: squeeze and excitation

Extra Evaluation

Cross-Validation

To show the robustness of the proposed 3D attention model and to remove randomness from the experiments, fourfold cross-validation was applied to the Sliver dataset. Sliver is one of the smallest datasets used in this work, which makes it a good candidate for cross-validation. Table 10 shows the number of volumes in the train, test, and validation sets for each fold. The number of volumes is shown as \(N\times M\times X\times Y\), where N is the number of volumes, M is the number of slices per volume, X is the slice width, and Y is the slice height.

Table 10 The number of volumes in the train, test, and validation sets for fourfold cross-validation on the Sliver dataset
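As an illustration of how such folds might be constructed (not the authors' exact split), the volumes can be partitioned with scikit-learn; the volume IDs and the size of the validation hold-out are placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold

volume_ids = np.arange(20)  # placeholder IDs for the labelled Sliver volumes

kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fold, (train_val_idx, test_idx) in enumerate(kf.split(volume_ids)):
    # Hold a few training volumes out as the validation set used to
    # evaluate candidate networks during evolution.
    val_idx, train_idx = train_val_idx[:3], train_val_idx[3:]
    print(f"fold {fold}: train={train_idx}, val={val_idx}, test={test_idx}")
```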

For each fold, evolution started with a population of 30 and continued for up to nine generations with a population of 15. During evolution, the validation set is used to evaluate the networks. At the end, the five best networks are selected, and their DSCs on the test set are reported in Table 11.

Table 11 The obtained DSCs of the five best networks in each fold

As can be seen from Table 11, the proposed evolutionary attention model obtained high-accuracy 3D networks for 3D medical image segmentation in every fold.

Effect of Attention Modules

To show the effect of using attention modules to recover and transfer extracted feature maps, examples of extracted features before and after the residual activation unit and the attention residual module are presented in Fig. 9. The first row of Fig. 9 shows two examples, a Heart and a Hippocampus image, along with their corresponding ground truth. Figure 9b shows a number of input feature maps to the residual activation unit, and Fig. 9c shows the module's output feature maps. As can be seen, after the attention modules are applied, part of the information about the region of interest (RoI) is recovered; a similar pattern can be seen for the Hippocampus image.

Fig. 9 Examples of input and output feature maps of the residual activation unit and attention residual block for Heart and Hippocampus images
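Of the six candidate modules, squeeze-and-excitation [28] is standard enough to sketch here; the Keras version below follows the original formulation and is not necessarily the exact variant evolved in this work:

```python
import tensorflow as tf
from tensorflow.keras import layers

def squeeze_excitation_2d(x, ratio=8):
    """Squeeze-and-excitation [28]: reweight channels using global context."""
    channels = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)               # squeeze: (B, C)
    s = layers.Dense(channels // ratio, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)  # per-channel gates
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])                     # rescale feature maps
```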

Subjective Comparison

In this section, another example of the segmented images for each dataset is provided as a subjective comparison. The results are compared with six previous works: Converted 2D to 3D [20], 3D U-Net [9], ConvNet [52], 3D Dense U-Net [31], U-Net attention [44], and NAS U-Net [50]. NAS U-Net [50] is an automatic reinforcement-learning-based technique for developing a network, Converted 2D to 3D [20] is an automatic evolutionary model, and the rest are manually designed networks. As shown in Fig. 10, the proposed model predicts the RoI with high accuracy, whereas over- or under-segmentation can be seen in some of the previous works' results; for example, ConvNet could not predict the RoI in the Hippocampus image at all. Because these networks were developed for a specific application or dataset, their structure or training parameters need to be tuned when the application or dataset changes. Note that all the previous works were implemented and trained as described in their source papers.

Fig. 10 One sample 3D segmented volume from each dataset. The red contour is the ground truth, cyan is the proposed 3D attention model, green is Converted 2D to 3D [20], blue is 3D U-Net [9], grey is U-Net attention [44], yellow is ConvNet [52], orange is 3D Dense U-Net [31], and magenta is NAS U-Net [50]

Example of Crossover and Mutation

To clarify the crossover and mutation in the proposed model, an example is provided in Fig. 11. Two chromosomes are selected as parents; applying one-point crossover generates two new offspring. In addition, three random mutations are applied to one child and one mutation to the other. This procedure is repeated to create each new generation. After evolving the model for the stated number of generations, a number of the best networks are selected as the final networks.

Fig. 11 An example of crossover and mutation
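A minimal sketch of the two operators in Fig. 11, with a hypothetical gene alphabet, might look as follows:

```python
import random

def one_point_crossover(p1, p2):
    """Cut both parents at the same random point and swap the tails."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(chromosome, n_mutations, gene_values):
    """Replace n_mutations randomly chosen genes with new legal values.

    gene_values maps each gene position to its set of legal values
    (hypothetical; the paper's actual gene alphabet is in Table 9).
    """
    child = chromosome[:]
    for pos in random.sample(range(len(child)), n_mutations):
        child[pos] = random.choice(gene_values[pos])
    return child

# As in Fig. 11: one-point crossover, then three mutations on one child
# and one mutation on the other.
```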


About this article

Cite this article

Hassanzadeh, T., Essam, D. & Sarker, R. Evolutionary Deep Attention Convolutional Neural Networks for 2D and 3D Medical Image Segmentation. J Digit Imaging 34, 1387–1404 (2021). https://doi.org/10.1007/s10278-021-00526-2
