Automatic segmentation of pulmonary lobes on low-dose computed tomography using deep learning
Introduction
Worldwide, lung cancer remains the leading cause of cancer incidence and mortality (1). The majority of patients with lung cancer are diagnosed at an advanced tumor stage, and only a small proportion are detected at an early stage. The International Early Lung Cancer Action Project (I-ELCAP) showed that annual low-dose computed tomography (LDCT) screening allows at least 85% of lung cancers to be diagnosed at stage I, with an estimated 10-year survival rate of 88% for these early-stage (stage I) cancers (2). Early diagnosis and treatment therefore play an important role in improving lung cancer survival and reducing lung cancer mortality.
As LDCT plays an increasingly important role in lung cancer screening, the workload of radiologists in identifying and diagnosing lung nodules grows accordingly, increasing both their burden and the rate of misdiagnosis. Computer-assisted diagnosis (CAD) methods can therefore assist radiologists in the early detection and diagnosis of suspicious nodules on LDCT.
Segmentation of the pulmonary lobes from LDCT images is a crucial step that allows CAD systems to identify pulmonary nodules more easily. This step aims to exclude structures that cannot contain pulmonary nodules, such as the mediastinum, thoracic wall, heart, and abdominal organs. To this end, many researchers have applied a variety of approaches to pulmonary lobe segmentation and proposed a number of segmentation algorithms. Conventional lobe segmentation algorithms were based on identifying the three major pulmonary fissures. Previous studies used anatomical information, including the fixed shape and position of the airways, vessels, and fissures, as prior knowledge to generate the final lobe segmentation (3,4). However, this anatomical information is not always reliable, so these methods could not achieve satisfactory segmentation results in all cases.
With the development of computer technology, deep learning has become one of the most promising technologies. From the earliest LeNet (5) to AlexNet (6), VggNet (7), GoogleNet (8), ResNet (9), and the recent DenseNet (10), CNN models have grown steadily stronger and more mature. They are now widely used in medical imaging, including lesion detection, qualitative diagnosis, automatic generation of structured reports, lesion extraction, and organ delineation for radiotherapy (11-15).
End-to-end trained fully convolutional networks (FCNs) were originally designed to solve image segmentation problems in computer vision (16,17), and have recently been widely applied to medical image segmentation. Gibson et al. (18) proposed a deep learning-based algorithm named the dense V-network (DenseVNet) for automatic multi-organ segmentation on abdominal CT. Imran et al. (19) combined the ideas of DenseVNet and progressive holistically nested networks to obtain a new architecture for pulmonary lobe segmentation on routine chest CT.
Other studies have successfully applied CNN models to lobe segmentation on routine chest CT. Gerard et al. (20) presented a pulmonary lobe segmentation pipeline consisting of a series of 3D convolutional neural networks. Xie et al. (21) used a multi-resolution approach with two cascaded CNNs to capture both global context and local details. Because of the reduced radiation dose, image noise on LDCT is markedly higher than on routine chest CT, making pulmonary lobe segmentation on LDCT a more difficult task.
This study aimed to verify the feasibility of pulmonary lobe segmentation in LDCT images using deep learning. We present the following article in accordance with the STARD reporting checklist (available at http://dx.doi.org/10.21037/atm-20-5060).
Methods
Datasets
The data used in this research were LDCT images acquired in the opportunistic lung cancer screening cohort at our hospital from October 2018 to February 2019. The LDCT dataset contained 160 cases in total (75 males, 85 females; median age, 56 years). Each LDCT case was obtained using 64-detector row scanners (GE LightSpeed VCT, GE Medical Systems Optima CT660, and GE Medical Systems Discovery CT750) at full inspiration. CT parameters were as follows: tube voltage, 120 kVp; tube current, 30 mAs, or 30–250 mA using automatic milliampere modulation (GE Medical Systems Discovery CT750) with a noise index of 40; pitch, 0.984; rotation time, 0.5 s. The thin-slice reconstruction thickness was 1.25 mm with an interval of 0.8 mm, using a standard reconstruction algorithm. The scanning range was from the lung apex to the diaphragm. LDCT images were saved in standard DICOM format with a resolution of 512 (width) by 512 (height) pixels.
The LDCT dataset was randomly divided into a training set, a validation set, and an independent test set containing 100, 10, and 50 cases, respectively. The validation set was used for hyperparameter tuning and for selecting the best model.
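The paper does not specify the splitting procedure beyond random assignment; the following sketch shows one way to reproduce a 100/10/50 split. The function name and the fixed seed are our own assumptions, not taken from the study.

```python
import random

def split_cases(case_ids, n_train=100, n_val=10, seed=42):
    """Randomly split case IDs into training, validation, and test sets.

    The seed is fixed so the split can be reproduced across runs.
    """
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])
```

With 160 cases this yields 100 training, 10 validation, and 50 test cases, matching the counts above.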
Reference standard segmentations
LDCT images were processed with 3D Slicer (version 4.6), an open-source platform for analyzing medical image data (22). The ground truth of pulmonary lobe segmentation was marked by a radiologist (Z Zhang) under the supervision of a senior radiologist (N Wu) with more than 30 years of experience in chest diagnosis. Normally, the right lung is composed of the upper, middle, and lower lobes, and the left lung of the upper and lower lobes, as shown in Figure 1. The segmentation procedure included three main steps: place a small number of fiducials on the fissures (the left oblique, right oblique, and right horizontal fissures); select the 3D Gaussian filter with the smooth filtering strength; and create the label map with the "Slow" method to achieve higher accuracy.
Before being input to the network, all images were preprocessed as follows: (I) resize the matrix to (256, 256, 256); (II) clip intensity values greater than 1,000 or less than −1,000. Data augmentation included random rotation within −10 to 10 degrees and random scaling of the image size by a factor of 0.8 to 1.2.
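A minimal sketch of these preprocessing and augmentation steps, assuming SciPy's `ndimage` for resampling; the function names and the trilinear interpolation order are our own choices, not details from the paper's pipeline:

```python
import numpy as np
from scipy import ndimage

def preprocess(volume, target_shape=(256, 256, 256), hu_window=(-1000, 1000)):
    """Resample a CT volume to a fixed matrix size and clip extreme HU values."""
    factors = [t / s for t, s in zip(target_shape, volume.shape)]
    resized = ndimage.zoom(volume, factors, order=1)  # trilinear resampling
    return np.clip(resized, hu_window[0], hu_window[1])

def augment(volume, rng):
    """Random in-plane rotation within +/-10 degrees and isotropic scaling by 0.8-1.2."""
    angle = rng.uniform(-10, 10)
    rotated = ndimage.rotate(volume, angle, axes=(1, 2), reshape=False,
                             order=1, mode="nearest")
    scale = rng.uniform(0.8, 1.2)
    return ndimage.zoom(rotated, scale, order=1)
```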
Deep learning-based segmentation network
A fully convolutional DenseVNet (18), implemented as part of NiftyNet (23), was used, as shown in Figure 2. The architecture consists of downsampling, upsampling, and skip connection components. The network uses dense feature stacks for learning and inference, which improves processing speed and accuracy. The input to each convolutional layer of a stack is the concatenation of the outputs of all preceding layers in the stack. The main advantage of this structure is that it significantly reduces the number of parameters and improves performance through gradient propagation and feature reuse (10). In the downsampling path, strided convolutional layers reduce the dimensionality of the input feature maps and connect the dense feature stacks. Single convolutional layers in the skip connections, followed by bilinear upsampling, transform the output feature maps from each stage of the encoder path back to the original image size. Each dense feature stack performs 3D convolution (3×3×3) with a learned kernel, uses the rectified linear unit (ReLU) as the activation function (24), and is followed by batch normalization (BN) (25) and dropout (26) with a probability of 0.5. The final layer is a 1×1 convolution followed by a softmax nonlinearity to predict the class label at each voxel. To train the DenseVNet, a Dice-based loss function (27) was used at each stage of the network. The model was trained for 1,000 epochs with an initial learning rate of 0.001, a batch size of 1, and the Adam optimizer.
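The Dice-based loss can be illustrated with a minimal NumPy sketch of a multi-class soft Dice loss; this is not the exact NiftyNet implementation, and the epsilon smoothing term is our own choice:

```python
import numpy as np

def soft_dice_loss(probs, labels, eps=1e-5):
    """Multi-class soft Dice loss.

    probs:  (N, C) softmax probabilities for N voxels and C classes
    labels: (N, C) one-hot ground-truth labels
    Returns 1 minus the mean per-class Dice, so a perfect match gives 0.
    """
    intersect = np.sum(probs * labels, axis=0)
    denom = np.sum(probs, axis=0) + np.sum(labels, axis=0)
    dice_per_class = (2.0 * intersect + eps) / (denom + eps)
    return 1.0 - dice_per_class.mean()
```

Because the loss is a smooth function of the predicted probabilities, it can be minimized directly by gradient descent without a separate thresholding step.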
Evaluation metrics
Each pulmonary lobe segmentation was compared to the ground truth segmentation by three metrics, which were described as follows:
Dice(A, B) = 2|A ∩ B| / (|A| + |B|)   [1]

Jaccard(A, B) = |A ∩ B| / |A ∪ B|   [2]

HD(A, B) = max{ max_{a∈A} min_{b∈B} D(a, b), max_{b∈B} min_{a∈A} D(a, b) }   [3]
where A and B are the algorithm and ground truth segmentations, and D(a, b) is the Euclidean distance between boundary pixels a ∈ A and b ∈ B. The Dice coefficient and the Jaccard coefficient are overlap-based metrics used to compare volume similarity between our algorithm's segmentations and the ground truth on the LDCT cases. The Hausdorff distance is the maximum distance from a point in one set to the nearest point in the other set (28), and reflects agreement between segmentation boundaries: the smaller the Hausdorff distance, the better the boundary agreement.
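All three metrics can be computed directly from binary masks and boundary point sets; a minimal NumPy sketch (function names are our own):

```python
import numpy as np

def dice(a, b):
    """Dice coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def jaccard(a, b):
    """Jaccard coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def hausdorff(pts_a, pts_b):
    """Symmetric Hausdorff distance between two sets of boundary points."""
    # Pairwise Euclidean distance matrix via broadcasting: (|A|, |B|)
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

Note that the full pairwise distance matrix is O(|A|·|B|) in memory; for large 3D boundaries, practical implementations use spatial indexing (e.g., a KD-tree) instead.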
Statistical analysis
The performance of the proposed DenseVNet and of the U-net model was evaluated using three evaluation metrics: the Dice coefficient, the Jaccard coefficient, and the Hausdorff distance, each summarized as an average. On the test set, an independent-samples t-test was used to compare the proposed DenseVNet with the U-net model. A P value less than 0.05 was considered significant. Statistical analyses were performed using SPSS 22.0 (SPSS Inc., Chicago, IL, USA).
The Bland-Altman method (29) was used to measure the agreement between our algorithm and the ground truth on the LDCT cases.
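The Bland-Altman bias and 95% limits of agreement reduce to a few lines of code; a minimal sketch (the function name is our own):

```python
import numpy as np

def bland_altman(x, y):
    """Return the bias (mean difference) and 95% limits of agreement
    between two measurement methods x and y (e.g., algorithm vs. ground
    truth lobe volumes)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```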
Ethics statement
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This retrospective study was approved by the medical ethics committee of National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (NO.: 19-018/1840) and written informed consent was waived due to the retrospective nature of the study.
Results
Implementation
The network was implemented with NiftyNet on the TensorFlow platform. Training and validation of our model were conducted on a local workstation equipped with an Intel Xeon CPU, an NVIDIA Tesla V100 GPU, and 64 GB of RAM. Training the proposed method took 20 hours. For reproducibility, we have shared our trained DenseVNet model config on GitHub (https://github.com/RainyRen/LungLobeSeg).
We selected the U-net model, a widely used network for biomedical image segmentation (30), as a baseline. The proposed DenseVNet model takes 3.1 seconds to segment the lung lobes from one LDCT scan, approximately 2 times faster than the U-net model, with only about half as many parameters. Moreover, the DenseVNet model required 6 GB of GPU memory for training, while the U-net model required at least 14 GB.
Experimental results
The LDCT images were acquired on multi-detector CT scanners. The in-plane spatial resolution was between 0.5 and 1 mm, and the reconstruction thickness was 1.25 mm. Thin-slice reconstruction is now common in clinical diagnosis, particularly for LDCT-based image analysis such as lung nodule detection.
To demonstrate its effectiveness, Figure 3 illustrates examples of segmentation results generated by our network. The original LDCT images (sagittal and coronal views) are shown without and with the overlaid segmentation side by side for comparison. A three-dimensional rendering of the segmentation result is also provided to give a more vivid view of the pulmonary lobe segmentation. As demonstrated, the method successfully separated the pulmonary lobes from the LDCT images in all cases.
The performance of all three metrics (Dice coefficient, Jaccard coefficient, and Hausdorff distance) on the test set is summarized in Table 1. The proposed method achieved an all-lobes Dice coefficient, Jaccard coefficient, and Hausdorff distance of 0.944, 0.896, and 92.908 mm, respectively. The DenseVNet had higher Dice and Jaccard coefficients than the U-net, with all differences statistically significant except for the left upper and left lower lobes. The right middle lobe was the least accurately segmented for both the DenseVNet and the U-net. The Hausdorff distance showed high variability among the lobes: the DenseVNet had a significantly lower Hausdorff distance than the U-net for the right upper lobe and the left lower lobe, but significantly higher values for the other lobes.
In every plot, the results of the proposed segmentation model were very close to the ground truth. The consistency of the left lobes was slightly inferior to that of the right lobes, as shown in Figure 4.
Discussion
LDCT imaging has become an efficient tool for the detection and follow-up of pulmonary nodules. Using CAD as a second reader can significantly improve radiologists' sensitivity in detecting pulmonary nodules on CT scans, making it a preferred form of assistance for radiologists (31). A CAD platform is needed to accurately detect pulmonary nodules and quantify their severity.
Previously proposed automatic techniques for pulmonary lobe segmentation are sensitive to anatomic and disease variations, reconstruction kernels, and parameter settings, so they frequently require manual correction by experienced radiologists, which is a barrier to application in a clinical workflow. Deep learning has become a popular method in medical image analysis that can be applied to clinical work automatically, accurately, and with substantial time savings, so we focused on a deep learning method to segment the pulmonary lobes on LDCT.
FCNs are a commonly used deep learning approach in computer vision, proposed to solve image segmentation problems in both natural and medical image analysis. Conventional convolutional neural networks usually use pooling layers to downsample the input, which reduces the resolution of the output feature maps. Upsampling layers then restore the resolution of the previous layer. FCNs achieve end-to-end pixel-wise prediction by replacing fully connected layers with a series of upsampling layers after the convolutional layers, which prevents the segmentation result from having a far lower resolution than the original images and preserves segmentation accuracy. Our study applied the DenseVNet to the pulmonary lobe segmentation task, and the model achieved accurate segmentation on LDCT.
LDCT images have greater image noise and streak artifacts than routine-dose CT, and some fine anatomical structures (such as airways, vessels, and fissures) that serve as the basis of lobe segmentation become blurred or undetectable due to the partial volume effect. Pulmonary lobe segmentation on LDCT images is therefore difficult, and few studies have extended their segmentation methods to LDCT.
The average Dice coefficients of each lobe and of all lobes were above 0.92 on the test set, indicating that pulmonary lobe segmentation on LDCT using the DenseVNet performed comparably to a previous study (19) on standard-dose CT. This result suggests that DenseVNet can accurately segment the pulmonary lobes on LDCT without interactive manual correction. The automatically generated lobe segmentations have the potential to support disease assessment and treatment planning in lung cancer screening.
Anatomical variation and partial volume effects between cases, especially around the right horizontal fissure, are obstacles to segmentation of the right middle lobe (32). Nevertheless, our model obtained good results for the right middle lobe, with a Dice coefficient of 0.923, a Jaccard coefficient of 0.859, and a Hausdorff distance of 62.455 mm on the test set. Because the LDCT scans were not electrocardiographically (ECG)-gated, the consistency of the left lobes was more affected by cardiac motion.
As in several other papers on medical image segmentation, the Dice coefficient was adopted as the main indicator of segmentation accuracy, supplemented by other metrics (33,34). The Dice and Jaccard coefficients account only for the number of correctly or incorrectly classified voxels without reflecting their spatial distribution, so we added the Hausdorff distance to indicate the boundary error of the segmentation results. Retrospective visual inspection of the results showed satisfactory accuracy over most of the images. However, the Hausdorff distance is sensitive to outliers, because it is determined solely by the largest boundary error. In some test set cases, a small area was misclassified as a pulmonary lobe, which can markedly increase the Hausdorff distance, as shown in Figure 5. We believe these misclassified areas may stem from the Dice-based loss function used in the proposed DenseVNet: when the loss reached its minimum during training, the segmentation boundary had not yet fully converged.
There are some limitations to our work. The training set was collected at our hospital and may not be representative of lung cancer screening cases in other regions. The generalization capability of our model requires further validation on multi-center datasets.
Conclusions
Deep learning can achieve automatic segmentation of the pulmonary lobes within the lung fields on LDCT images. On the test set, the all-lobes Dice coefficient was 0.944, the Jaccard coefficient was 0.896, and the Hausdorff distance was 92.908 mm. This deep learning algorithm for pulmonary lobe segmentation may benefit new research directions in lung cancer LDCT screening, such as assessing emphysema and pulmonary fibrosis and planning radiotherapy and surgical procedures.
Acknowledgments
Funding: This work was supported by the National Key R&D Program of China (2017YFC1308700), the CAMS Innovation Fund for Medical Sciences (2017-I2M-1-005), and the CAMS Innovation Fund for Medical Sciences (2019-I2M-2-002).
Footnote
Reporting Checklist: The authors have completed the STARD reporting checklist. Available at http://dx.doi.org/10.21037/atm-20-5060
Data Sharing Statement: Available at http://dx.doi.org/10.21037/atm-20-5060
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at http://dx.doi.org/10.21037/atm-20-5060). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This retrospective study was approved by the medical ethics committee of National Cancer Center/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (NO.: 19-018/1840) and written informed consent was waived due to the retrospective nature of the study.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Bray F, Ferlay J, Soerjomataram I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394-424. Erratum in: CA Cancer J Clin. 2020 Jul;70(4):313. doi: 10.3322/caac.21609. Epub 2020 Apr 6. [Crossref] [PubMed]
- International Early Lung Cancer Action Program Investigators; Henschke CI, Yankelevitz DF, et al. Survival of patients with stage I lung cancer detected on CT screening. N Engl J Med 2006;355:1763-71. [Crossref] [PubMed]
- Wang J, Betke M, Ko JP. Pulmonary fissure segmentation on CT. Med Image Anal 2006;10:530-47. [Crossref] [PubMed]
- Lassen B, van Rikxoort EM, Schmidt M, et al. Automatic segmentation of the pulmonary lobes from chest CT scans based on fissures, vessels, and bronchi. IEEE Trans Med Imaging 2013;32:210-22. [Crossref] [PubMed]
- Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE 1998;86:2278-324. [Crossref]
- Krizhevsky A, Sutskever I, Hinton GE, et al. ImageNet Classification with Deep Convolutional Neural Networks[C]. Neural Information Processing Systems, 2012:1097-105. Available online: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. Computer Vision and Pattern Recognition, arXiv:1409.1556 [cs.CV], 2014.
- Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. Computer Vision and Pattern Recognition, arXiv:1409.4842 [cs.CV], 2015.
- He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 27-30 June 2016; Las Vegas, NV, USA, IEEE, 2016.
- Huang G, Liu Z, Van Der Maaten L, et al. Densely Connected Convolutional Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 21-26 July 2017; Honolulu, HI, USA, IEEE, 2017.
- Ding J, Li A, Hu Z, et al. Accurate Pulmonary Nodule Detection in Computed Tomography Images Using Deep Convolutional Neural Networks. In: Descoteaux M, Maier-Hein L, Franz A, et al. editors. Medical Image Computing and Computer Assisted Intervention − MICCAI 2017. MICCAI 2017. Lecture Notes in Computer Science, vol 10435. Springer, Cham, 2017:559-67.
- Shen W, Zhou M, Yang F, et al. Multi-scale Convolutional Neural Networks for Lung Nodule Classification. Inf Process Med Imaging 2015;24:588-99. [Crossref] [PubMed]
- Kisilev P, Walach E, Barkan E, et al. From medical image to automatic medical report generation. IBM J Res Dev 2015;59:2:1-2:7.
- Norman B, Pedoia V, Majumdar S. Use of 2D U-Net Convolutional Neural Networks for Automated Cartilage and Meniscus Segmentation of Knee MR Imaging Data to Determine Relaxometry and Morphometry. Radiology 2018;288:177-85. [Crossref] [PubMed]
- Men K, Zhang T, Chen X, et al. Fully automatic and robust segmentation of the clinical target volume for radiotherapy of breast cancer using big data and deep learning. Phys Med 2018;50:13-9. [Crossref] [PubMed]
- Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. ICCV '15: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), 2015:1520-8.
- Shelhamer E, Long J, Darrell T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans Pattern Anal Mach Intell 2017;39:640-51. [Crossref] [PubMed]
- Gibson E, Giganti F, Hu Y, et al. Automatic Multi-Organ Segmentation on Abdominal CT With Dense V-Networks. IEEE Trans Med Imaging 2018;37:1822-34. [Crossref] [PubMed]
- Imran AAZ, Hatamizadeh A, Ananth SP, et al. Fast and automatic segmentation of pulmonary lobes from chest CT using a progressive dense V-network. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 2019. [Crossref]
- Gerard SE, Reinhardt JM. Pulmonary lobe segmentation using a sequence of convolutional neural networks for marginal learning. 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019); 8-11 April 2019; Venice, Italy, Italy, IEEE, 2019:1207-11.
- Xie W, Jacobs C, Charbonnier JP, et al. Relational Modeling for Robust and Efficient Pulmonary Lobe Segmentation in CT Scans. IEEE Trans Med Imaging 2020;39:2664-75. [Crossref] [PubMed]
- Fedorov A, Beichel R, Kalpathy-Cramer J, et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30:1323-41. [Crossref] [PubMed]
- Gibson E, Li W, Sudre C, et al. NiftyNet: a deep-learning platform for medical imaging. Comput Methods Programs Biomed 2018;158:113-22. [Crossref] [PubMed]
- Nair V, Hinton GE. Rectified Linear Units Improve Restricted Boltzmann Machines. ICML'10: Proceedings of the 27th International Conference on International Conference on Machine Learning, 2010:807-14.
- Ioffe S, Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning, PMLR, 2015;37:448-56.
- Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 2014;15:1929-58.
- Milletari F, Navab N, Ahmadi SA. V-net: Fully convolutional neural networks for volumetric medical image segmentation. 2016 Fourth International Conference on 3D Vision (3DV). IEEE, 2016:565-71.
- Rote G. Computing the minimum Hausdorff distance between two point sets on a line under translation. Information Processing Letters 1991;38:123-7. [Crossref]
- Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10. [Crossref] [PubMed]
- Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab N, Hornegger J, Wells W, et al. editors. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham, 2015. doi: 10.1007/978-3-319-24574-4_28. [Crossref]
- Way T, Chan HP, Hadjiiski L, et al. Computer-aided diagnosis of lung nodules on CT scans: ROC study of its effect on radiologists' performance. Acad Radiol 2010;17:323-32. [Crossref] [PubMed]
- Cronin P, Gross BH, Kelly AM, et al. Normal and accessory fissures of the lung: evaluation with contiguous volumetric thin-section multidetector CT. Eur J Radiol 2010;75:e1-8. [Crossref] [PubMed]
- Linguraru MG, Pura JA, Pamulapati V, et al. Statistical 4D graphs for multi-organ abdominal segmentation from multiphase CT. Med Image Anal 2012;16:904-14. [Crossref] [PubMed]
- Chowdhury N, Toth R, Chappelow J, et al. Concurrent segmentation of the prostate on MRI and CT via linked statistical shape models for radiotherapy planning. Med Phys 2012;39:2214-28. [Crossref] [PubMed]