Abstract
Radiotherapy doses to some cardio-pulmonary substructures may be critical factors in the observed early mortality following radiotherapy for non-small cell lung cancer patients. Our goal is to provide an open-source tool to automatically segment cardio-vascular substructures for consistent outcomes analyses, and subsequently for radiation treatment planning of thoracic patients. Here, we built and validated a multi-label Deep Learning Segmentation (DLS) framework for accurate auto-segmentation of cardio-pulmonary substructures. The DLS framework utilized a deep neural network architecture to segment 12 cardio-pulmonary substructures from Computed Tomography (CT) scans of 217 patients previously treated with thoracic RT. The model was robust against variability in image quality characteristics, including the presence/absence of contrast. A hold-out dataset of an additional 24 CT scans was used for quantitative evaluation of the final model against expert contours using the Dice Similarity Coefficient (DSC) and the 95th Percentile of Hausdorff Distance (HD95). DLS contours of an additional 10 CT scans were reviewed by a radiation oncologist to determine the number of slices in need of adjustment for each of the non-overlapping substructures. The DLS model reduced segmentation time per patient from about one hour of manual segmentation to 10 s. Quantitatively, the highest accuracy was observed for the Heart (median DSC = 0.96 \((0.91-0.93)\); median HD95 = 4.3 mm \((3.8-5.5)\) mm). The median DSC for the remaining structures was \(0.80-0.92\). The expert judged that, on average, 85% of the contours were equivalent to state-of-the-art manual contouring and did not require any modifications.
1 Introduction
Various studies have shown that doses to some cardio-vascular substructures may be critical factors in the observed heart toxicity and early mortality following radiotherapy (RT) for non-small cell lung cancer (NSCLC) [10, 14,15,16]. This may be attributed to irradiation of particular constituents of the cardio-pulmonary system. Currently, segmentation of cardio-pulmonary organs other than the whole heart and lung is overlooked, and only these two organs are routinely defined as part of the treatment planning process. RT planning requires robust and accurate segmentation of organs-at-risk in order to maximize radiation to the disease location while sparing normal tissue as much as possible. Introducing a new set of organs places demands on both segmentation accuracy and segmentation time, since manual segmentation and contour refinement of these additional organs would add an overhead of several hours in the clinic.
We built and validated a multi-label Deep Learning Segmentation (DLS) framework for accurate auto-segmentation of cardio-pulmonary substructures. The DLS framework utilized a deep convolutional neural network architecture to segment 12 cardio-pulmonary substructures [4] from Computed Tomography (CT) scans of 217 patients previously treated with thoracic RT. The segmented substructures are: Heart, Pericardium, Atria, Ventricles, Aorta, Left Atrium (LA), Right Atrium (RA), Left Ventricle (LV), Right Ventricle (RV), Inferior Vena Cava (IVC), Superior Vena Cava (SVC) and Pulmonary Artery (PA). We evaluate our framework using a hold-out dataset of 24 CT scans by calculating quantitative as well as qualitative validation metrics. A radiation oncologist qualitatively evaluated auto-generated contours for an additional set of 10 CT scans to determine that, on average, 85% of the non-overlapping substructure contours required no modifications and were acceptable for clinical use.
2 Methodology
Our approach utilizes a deep neural network for 2D segmentation of contrast- as well as non-contrast-enhanced thoracic CT images. The input pipeline auto-crops CT scans around the lungs to extract the region of interest. The network is trained to perform multi-label prediction of eight non-overlapping, contiguous substructures: the aorta, LA, LV, RA, RV, IVC, SVC and PA. Additionally, it is trained individually to segment the overlapping structures: the heart, the atria, the pericardium and the ventricles. Output label predictions from the multi-label segmentation network and the overlapping structures were combined for each input scan, resulting in auto-segmentation of 12 cardio-pulmonary substructures.
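The merging step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the label-index assignment, function name and output format are assumptions; the key point is that the eight exclusive labels come from a single integer label map, while overlapping structures (heart, atria, pericardium, ventricles) are carried as independent binary masks so they may coexist with the exclusive labels.

```python
import numpy as np

# Hypothetical label indices for the 8 non-overlapping substructures
# (an assumed ordering; the paper does not specify index assignment).
AORTA, LA, LV, RA, RV, IVC, SVC, PA = range(1, 9)

def combine_predictions(multilabel_map, overlap_masks):
    """Merge the multi-label prediction with independently predicted
    overlapping structures.

    multilabel_map : (H, W) int array, 0 = background, 1..8 = substructures
    overlap_masks  : dict name -> (H, W) bool array (e.g. heart, pericardium)

    Returns a dict of binary masks, one per output structure.
    """
    out = {name: (multilabel_map == idx)
           for name, idx in [("aorta", AORTA), ("la", LA), ("lv", LV),
                             ("ra", RA), ("rv", RV), ("ivc", IVC),
                             ("svc", SVC), ("pa", PA)]}
    # Overlapping structures are kept as separate channels rather than
    # folded into the exclusive label map, so pixels can carry both
    # e.g. "heart" and "aorta".
    out.update({name: mask.astype(bool) for name, mask in overlap_masks.items()})
    return out
```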
2.1 Experimental Datasets
Experimental data consisted of CT scans of 241 patients obtained from our institutional clinic. This data consisted of contrast- as well as non-contrast-enhanced images of varying imaging quality and resolution across different scanners. Manual expert segmentation of the 12 cardio-pulmonary organs-at-risk was considered ground truth and used for model training, testing and validation: 192 CT scans were utilized for model training, 24 for model testing, and the remaining 24 for hold-out validation. These scans were auto-cropped around the lungs to extract the volume of interest around the heart substructures. 2D axial slices pertaining to each patient image volume were resized to \(512\times 512\) and normalized, resulting in a total of 10,284 training images. Network input data was augmented per batch, with augmentation consisting of random cropping, random horizontal and vertical flipping, and rotation by ten degrees. The resulting auto-segmented 2D axial slices were stacked back together to generate 3D segmentations without further post-processing.
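The per-batch augmentation described above can be sketched as a single function. This is a hedged illustration only: the crop size (90% of each dimension), flip probabilities (0.5), and sampling of the rotation angle uniformly within ±10° are assumptions, since the paper does not state them; the operations themselves (random crop, random horizontal/vertical flip, ten-degree rotation) are those listed in the text.

```python
import numpy as np
from scipy import ndimage

def augment(image, label, rng, max_angle=10.0):
    """One illustrative augmentation pass: random crop, random flips,
    rotation. Crop fraction and probabilities are assumptions."""
    h, w = image.shape
    ch, cw = int(0.9 * h), int(0.9 * w)          # assumed crop size
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    image, label = image[y:y + ch, x:x + cw], label[y:y + ch, x:x + cw]
    if rng.random() < 0.5:                        # horizontal flip
        image, label = image[:, ::-1], label[:, ::-1]
    if rng.random() < 0.5:                        # vertical flip
        image, label = image[::-1, :], label[::-1, :]
    angle = rng.uniform(-max_angle, max_angle)    # assumed sampling
    image = ndimage.rotate(image, angle, reshape=False, order=1)
    # Nearest-neighbour interpolation for the label map so that class
    # indices remain integral after rotation.
    label = ndimage.rotate(label, angle, reshape=False, order=0)
    return image, label
```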
An additional dataset of 10 RT planning thoracic CT scans, for which no expert contours were available, was used for qualitative contour evaluation by a radiation oncologist to determine auto-generated contour acceptability for clinical use.
2.2 Network Architecture
Our approach, as depicted in Fig. 1, leveraged the deep neural network architecture of [1]. Convolutional neural networks (CNNs) and encoder-decoder neural networks have been successfully employed for medical image segmentation tasks [6, 7, 12, 13]. The DeepLab encoder-decoder network architecture with atrous separable convolutions uses spatial pyramid pooling to encode multi-scale contextual information, capturing the spatial anatomical relationships of contiguous structures. Dense feature maps extracted at the end of the encoder path carry detailed semantic information. The decoder network robustly recovers structure boundaries through bilinear upsampling by a factor of 4, while applying atrous convolutions to reduce features before semantic labeling. We trained the network using ResNet-101 [5] as the encoder backbone with a base learning rate of 0.01 under the "poly" learning-rate schedule [8], crop size = \(513\times 513\), batch size = 8, cross-entropy loss, and output stride = 16 for 50 epochs of dense label prediction. Our approach was implemented using the PyTorch DL framework.
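The "poly" learning-rate schedule referenced above [8] decays the base rate polynomially towards zero over training. A minimal sketch, assuming the commonly used power of 0.9 from the DeepLab papers (the paper does not state the power explicitly):

```python
def poly_lr(base_lr, step, max_steps, power=0.9):
    """DeepLab-style "poly" learning-rate policy:
    lr = base_lr * (1 - step / max_steps) ** power,
    decaying from base_lr at step 0 to 0 at max_steps."""
    return base_lr * (1.0 - step / max_steps) ** power
```

In practice this plugs directly into an optimizer step loop (or a `LambdaLR` scheduler in PyTorch); the point of the policy is a smooth, front-loaded decay rather than stepwise drops.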
We also investigated the performance of various network loss functions and their influence on correct multi-label prediction. We trained our network with various segmentation losses on the same architecture backbone to account for varying structure sizes and class imbalance during training, and to determine the efficacy of modifying label prediction probabilities during back-propagation for multi-label segmentation. The network was trained using cross-entropy (CE), Multi-class Dice Loss (M-DSC), Generalized Dice Loss (G-DSC) [11] and a weighted combination (0.5 G-DSC + 0.5 CE), with pixel-wise CE resulting in superior segmentation performance. The cross-entropy loss can be written as

\(L_{CE}(\theta ) = -\sum _{x_i\in X}\log p(t_i|x_i;\theta ),\)

where X denotes the input images and \(p(t_i|x_i;\theta )\) is the probability of the target class \(t_i\) predicted for pixel \(x_i\in X\) with network parameters \(\theta \). A quantitative comparison of auto-segmentation results using these losses can be found in Table 2.
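Of the alternatives compared above, the Generalized Dice Loss [11] is the one designed specifically for class imbalance: each class is weighted by the inverse squared label volume, so small tubular structures contribute as much to the loss as the large aorta. A minimal NumPy sketch (the function name and flattened \((C, N)\) layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def generalized_dice_loss(probs, onehot, eps=1e-6):
    """Generalized Dice Loss [11], sketched in NumPy for clarity.

    probs  : (C, N) predicted class probabilities per pixel
    onehot : (C, N) one-hot ground-truth labels

    Each class c is weighted by 1 / (sum_i g_ci)^2, counteracting the
    imbalance between large and small substructures.
    """
    w = 1.0 / (onehot.sum(axis=1) ** 2 + eps)
    intersect = (w * (probs * onehot).sum(axis=1)).sum()
    denom = (w * (probs + onehot).sum(axis=1)).sum()
    return 1.0 - 2.0 * intersect / (denom + eps)
```

A perfect prediction drives the loss towards 0; the same formula transfers directly to a PyTorch tensor implementation for back-propagation.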
2.3 Model Evaluation
We quantitatively evaluated the auto-generated segmentations of 24 patients against expert clinical segmentations using the Dice Similarity Coefficient (DSC) and the 95th Percentile Hausdorff Distance (HD95 (mm)). Additionally, an expert qualitatively evaluated the auto-generated multi-label segmentations for an additional cohort of 10 thoracic CT scans to validate the clinical usability of the auto-contours. No expert contours were available for this additional validation dataset. The expert reviewed substructure contours on axial slices of the CT images and rated them on a four-grade score: Good (requiring no adjustments), Acceptable (acceptable auto-contour deviations), Need of Adjustments (NOA) and Poor (requiring a larger number of slice adjustments). Rating was performed by listing the number of slices requiring contour adjustments in relation to the average number of slices spanning each substructure. Criteria for the clinical contour scoring are presented in Table 1.
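The two quantitative metrics can be sketched compactly with distance transforms. This is an illustrative implementation, not the exact evaluation tool used in the study; in particular, the symmetric pooling of surface distances before taking the 95th percentile is one common HD95 convention among several.

```python
import numpy as np
from scipy import ndimage

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def hd95(a, b, spacing=(1.0, 1.0)):
    """95th-percentile symmetric Hausdorff distance (mm) between the
    surfaces of two binary masks. `spacing` is the pixel size."""
    a, b = a.astype(bool), b.astype(bool)
    # Surface = mask minus its erosion.
    sa = a & ~ndimage.binary_erosion(a)
    sb = b & ~ndimage.binary_erosion(b)
    # Distance from every pixel to the nearest surface pixel of the
    # other mask, in physical units.
    da = ndimage.distance_transform_edt(~sa, sampling=spacing)
    db = ndimage.distance_transform_edt(~sb, sampling=spacing)
    dists = np.concatenate([db[sa], da[sb]])
    return np.percentile(dists, 95)
```

Identical masks yield DSC = 1 and HD95 = 0; lower HD95 indicates closer agreement of the contour surfaces.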
3 Results and Discussion
Table 2 compares the DSC evaluation metric for segmentations trained with the cross-entropy (CE) loss against the other network training losses. Our experiments demonstrated that pixel-wise target class loss calculation using CE resulted in improved multi-label segmentation predictions when compared against Multi-class Dice Loss (M-DSC), Generalized Dice Loss (G-DSC) and a weighted combination (0.5 CE + 0.5 G-DSC). Although the DSC score for the Aorta, the largest substructure in the multi-label segmentation, was improved as expected using the G-DSC loss, the accuracy of the smaller, tubular substructures was reduced.
Figure 2 displays the DSC results for the 24 hold-out validation CT images for all 12 substructures segmented using the CE loss. Our achieved DSC accuracies are comparable to the state-of-the-art multi-atlas [9] and deep learning methods [3] for segmenting cardio-pulmonary substructures from CT images. The highest segmentation accuracy was observed for the heart (median DSC = 0.96, median HD95 = 3.48 mm), with the remaining structures achieving median accuracies of 0.81 \(\le \) DSC \(\le \) 0.94 and 3 mm \(\le \) HD95 \(\le \) 6 mm; the best HD95 surface-distance accuracy among these was observed for the Aorta. Figure 3 displays the qualitative contour results comparing the DLS contours against expert contours.
Table 3 displays the clinical contour evaluation scores of the auto-generated contours for 10 thoracic RT CT scans using the grading criteria described in Table 1. The expert identified all needed adjustments as minor modifications, with contours in acceptable ranges for the IVC, SVC, PA, LA and LV (median adjustments ranging from 5% to 15%). The most adjustments were required in the RV, with a median of 24% of contours requiring modifications. Most of the minor adjustments were observed near the superior portion of the structures, at contour transitions between adjacent CT slices. This may be attributed to image artifacts introduced by heart motion during image acquisition.
The qualitative scores and quantitative evaluation for the aorta and IVC are lower than expected because both substructure segmentations were extended onto image slices beyond the clinical contouring protocol. According to the clinical contouring guidelines, these two substructures should not be contoured more than two slices below the last contoured axial slice of the heart. However, because the network was not trained on a large set of background CT images beyond the inferior extent of the heart contour, our model continued to segment the aorta and IVC wherever their edges remained visible beyond the heart contour. This highlights the need to consider additional spatial context in the training data when generating clinically acceptable auto-segmentations as input to radiation treatment planning.
4 Conclusion
We propose a model for auto-segmentation of cardio-pulmonary substructures from contrast- and non-contrast-enhanced CT images. The proposed model reduced substructure segmentation time for a new patient from about one hour of manual segmentation to approximately 10 s. We demonstrated that the model is robust against variability in image quality characteristics, including the presence or absence of contrast. We validated our approach by quantitatively comparing the resulting contours against expert delineations. An expert concluded that overall 85% of the auto-generated contours are acceptable for clinical use without requiring adjustments. The resulting segmentations can effectively be utilized to study heart toxicity and clinical outcomes, as well as serve as input to radiation therapy treatment planning. We have applied our approach to auto-segment an additional 283 treatment planning CT scans to study heart toxicity outcomes for lung cancer. The developed cardio-pulmonary segmentation models have been integrated into deep learning tools within the open-source CERR [2] platform.
References
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_49
Deasy, J., Blanco, A., Clark, V.: CERR: a computational environment for radiotherapy research. Med. Phys. 30(5), 979–985 (2003)
Dormer, J.D., et al.: Heart chamber segmentation from CT using convolutional neural networks. In: Proceedings of SPIE - The International Society for Optical Engineering (2018)
Feng, M., Moran, J., Koelling, T., et al.: Development and validation of a heart atlas to study cardiac exposure to radiation following treatment for breast cancer. Int. J. Radiat. Oncol. Biol. Phys. 79(1), 10–18 (2010)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, June 2016. https://doi.org/10.1109/CVPR.2016.90
Isensee, F., et al.: nnU-Net: self-adapting framework for U-Net-based medical image segmentation. CoRR abs/1809.10486 (2018). http://arxiv.org/abs/1809.10486
Jin, Q., Meng, Z., Sun, C., Wei, L., Su, R.: RA-UNet: a hybrid deep attention-aware network to extract liver and tumor in CT scans. CoRR abs/1811.01328 (2018). http://arxiv.org/abs/1811.01328
Liu, W., Rabinovich, A., Berg, A.C.: ParseNet: looking wider to see better. CoRR abs/1506.04579 (2015). http://arxiv.org/abs/1506.04579
Luo, Y., et al.: Automatic segmentation of cardiac substructures from noncontrast CT images: accurate enough for dosimetric analysis? Acta Oncol. 58(1), 81–87 (2019)
McWilliam, A., Kennedy, J., et al.: Radiation dose to heart base linked with poorer survival in lung cancer patients. Eur. J. Cancer 85, 106–113 (2017)
Milletari, F., Navab, N., Ahmadi, S.: V-Net: fully convolutional neural networks for volumetric medical image segmentation. CoRR abs/1606.04797 (2016). http://arxiv.org/abs/1606.04797
Oktay, O., et al.: Anatomically constrained neural networks (ACNNs): application to cardiac image enhancement and segmentation. IEEE Trans. Med. Imaging 37(2), 384–395 (2018)
Oktay, O., et al.: Attention U-Net: learning where to look for the pancreas. CoRR abs/1804.03999 (2018). http://arxiv.org/abs/1804.03999
Dess, R.T., Sun, Y., et al.: Cardiac events after radiation therapy: combined analysis of prospective multicenter trials for locally advanced non-small-cell lung cancer. J. Clin. Oncol. 35, 1395–1402 (2017)
Thor, M., Deasy, J., et al.: The role of heart-related dose-volume metrics on overall survival in the RTOG 0617 clinical trial. Int. J. Radiat. Oncol. Biol. Phys. 102, S96 (2018)
Vivekanandan, S., Landau, D., Counsell, N., Warren, D., Khwanda, A., et al.: The impact of cardiac radiation dosimetry on survival after radiation therapy for non-small cell lung cancer. Int. J. Radiat. Oncol. 99, 51–60 (2017)
Acknowledgments
This research is partially supported by NCI R01 CA198121.
© 2019 Springer Nature Switzerland AG
Haq, R., Hotca, A., Apte, A., Rimner, A., Deasy, J.O., Thor, M. (2019). Cardio-Pulmonary Substructure Segmentation of CT Images Using Convolutional Neural Networks. In: Nguyen, D., Xing, L., Jiang, S. (eds) Artificial Intelligence in Radiation Therapy. AIRT 2019. Lecture Notes in Computer Science(), vol 11850. Springer, Cham. https://doi.org/10.1007/978-3-030-32486-5_20