Early Lung Cancer Detection Using Deep Learning Optimization

This paper proposes a computer aided detection (CADe) system for the early detection of lung nodules from low dose computed tomography (LDCT) images. The proposed system first preprocesses the raw data to improve the contrast of the low dose images. Compact deep learning features are then extracted by investigating different deep learning architectures, including the Alex, VGG16, and VGG19 networks. To optimize the extracted set of features, a genetic algorithm (GA) is trained to select the most relevant features for early detection. Finally, different types of classifiers are tested in order to accurately detect the lung nodules. The system is tested on 320 LDCT images from 50 different subjects, using an online public lung database, i.e., the International Early Lung Cancer Action Project (I-ELCAP). The proposed system, using the VGG19 architecture and an SVM classifier, achieves the best detection accuracy of 96.25%, with a sensitivity of 97.5% and a specificity of 95%. Compared to other state-of-the-art methods, the proposed system shows promising results.


Introduction
Worldwide, cancer is the second leading cause of death after cardiovascular diseases. The American Cancer Society (ACS) estimated 1,762,450 new cancer cases in 2019 in the United States, accounting for 606,880 deaths [1]. Lung cancer represents 13% of these new cases (228,150 new cases) and accounts for 142,670 deaths [1]. This number is about 24% of all cancer deaths, which is the highest mortality rate among all types of cancer [1]. Therefore, there is an urgent need to investigate how to improve the survival rate of lung cancer.
A recent study in the journal Nature Medicine [2] reported that low dose computed tomography (LDCT) can reduce the mortality rate by more than 20%; therefore, it has been recommended for routine screening in the United States. The main advantage of LDCT screening is its ability to show high resolution chest images at a safer x-ray dosage. Manual analysis of CT images to detect lung nodules is time consuming and subjective. In addition, it suffers from inter- and intra-subject variabilities. Therefore, several research teams have developed computer aided detection (CADe) systems to help the radiologist accurately detect nodules [3][4][5]. One of the main challenges in CADe system design is the detection of early cancerous nodules. At an early stage of lung cancer, the nodules are small, e.g., between 3 mm and 30 mm [6], and therefore difficult to detect. However, clinical reports show that early detection is a major key to significantly improving the survival rate of lung cancer [7].
This study aims to develop a CADe system for early lung cancer detection. To this end, data from the International Early Lung Cancer Action Project (I-ELCAP) [8] are collected. The proposed system is designed to: i. preprocess the raw LDCT data in order to improve its contrast, ii. extract compact deep learning features from the LDCT image, iii. optimize the extracted features in order to improve the detection accuracy, and iv. classify the LDCT image as normal or cancerous, based on the optimized feature vector.
The main contributions of the proposed CADe system are twofold:

Fig. 2. Proposed CADe system for nodule detection, composed of four steps: preprocessing, feature extraction using a CNN, feature selection using offline genetic algorithm training, and classification. The output is either normal or cancerous.

The current paper is an extension of our published paper in [11], with the following contributions: i. investigating more architectures beyond the Alex network, including VGG16 and VGG19, ii. investigating more classifiers beyond the support vector machine (SVM) used in [11], including K-nearest neighbor (KNN) and decision trees, and iii. adding more experiments and discussion in order to precisely quantify the advantages and limitations of the proposed CADe system.
The rest of the paper is organized as follows. Section 2 outlines the related work on lung nodule detection. Section 3 details the proposed CADe system. Section 4 presents the experimental results and related discussion. Finally, Section 5 concludes the paper and outlines future work.

Literature Review
In the literature, several research groups have developed different CADe systems for cancerous lung nodule detection. These systems can be categorized as traditional systems and deep Convolutional Neural Network (CNN) systems. This section reviews each category, along with its strengths and limitations.

Traditional CADe systems
Traditional CADe systems involve four main steps: lung segmentation, feature extraction, nodule detection, and false positive reduction [5]. These methods usually extract different features from the segmented lung regions in order to differentiate between normal and cancerous lung images. The extracted features may include intensity-based features, statistical features, textures, and/or nodule shapes. For example, Setio et al. [12] applied a 3D lung segmentation technique to identify the lung regions from the CT images. A feature vector, composed of 24 intensity-based, blobness, shape, and texture features, was extracted from the segmented lung regions in order to detect lung nodules. A radial basis SVM was further used for the detection of large nodules. Li et al. [13] segmented the lung regions using a threshold-based technique. A set of intensity-based and texture features, including contrast, energy, correlation, homogeneity, and the grey level co-occurrence matrix (GLCM), was extracted from the segmented lungs. Then an SVM was used for classification. Amer et al. [14] segmented the lung regions using bi-thresholding and morphological operations. A set of statistical, texture, histogram-based, and wavelet features was fused using a genetic algorithm to perform early lung nodule detection. Although traditional CADe systems have achieved considerable success in the detection of lung nodules, more sophisticated features still need to be investigated in order to provide more accurate detection, especially for early cases.

Deep CNN CADe systems
Recently, deep learning CNNs have shown remarkable success in lung nodule detection [3][4]. In a CNN, medical images are processed directly, most often without segmenting the lung fields, through several convolutional layers that work as spatially localized filters, followed by fully connected layers. These CNN architectures involve a very large number of parameters, or weights, in order to encode the images into a compact high-level feature space. To adjust the CNN parameters, training should be performed using a very large number of images to avoid over-fitting. The extracted deep learned feature vector has shown a significant capability to precisely describe the training data and distinguish between normal and cancerous lung nodules [15]. For example, Jin et al. [16] trained a 3D CNN architecture, composed of eleven layers, using the segmented lung regions for the task of lung nodule detection. Their method achieved a detection accuracy of 87.5%. Shen et al. [17] achieved a detection accuracy of 87.14% using a CNN architecture that involved multi-crop pooling of convolutional layers. Wang et al. [18] applied a multi-level feature pyramid network followed by 3D non-maximum suppression and a 3D CNN to achieve a sensitivity of 95.8% on the LUng Nodule Analysis (LUNA16) dataset [18]. More recently, Winkels et al. [19] showed that a 3D CNN with group convolutions was able to reduce the number of false positives in lung nodule detection. The promising results of deep learning CADe systems throughout the literature encourage us to adopt this approach to extract the LDCT image features.

Methods
The proposed CADe system includes four processing stages, illustrated in Figure 1. First, the test LDCT image is input to a preprocessing stage, where the contrast of the image is improved. Second, a CNN model is used to extract a compact feature vector that describes the LDCT image. Third, a feature selection stage is applied to select the most relevant features for the task of pulmonary nodule detection. The set of relevant features is determined offline using a genetic algorithm (GA), which significantly reduces the dimension of the feature space and improves the detection accuracy. Finally, a classification stage is applied to determine whether or not the test LDCT image contains pulmonary nodules. This section details the procedures of each of these stages.

Database description
The LDCT images are collected from an online publicly available database, i.e., the Early Lung Cancer Action Project (ELCAP) database [8]. In order to test the proposed CADe system, 320 LDCT images from 40 different subjects are selected randomly, including 160 normal cross-sections and 160 cancerous ones, to avoid bias during the experiments. In order to account for early lung nodule detection, the nodules are in the range between 3 mm and 30 mm. The images have a resolution of 0.76×0.76×1.25 mm. Typical normal and cancerous samples of LDCT images from the I-ELCAP project are shown in Figure 2.

Preprocessing of LDCT images
The preprocessing stage is implemented in three steps, as shown in Figure 3. First, the contrast of the raw image is enhanced using the histogram stretching technique [19]. Second, a smoothing Wiener filter is applied in order to remove the scanner noise. Finally, the image is cropped to the standard input size of the CNN model that is used for feature extraction (i.e., 227 × 227 for the Alex architecture, and 224 × 224 for the VGG16 and VGG19 architectures, see Table 1).

Fig. 3. Preprocessing raw LDCT data in three steps: histogram stretching to improve image contrast, image smoothing using a Wiener filter, and cropping to the standard input size of the CNN model used for feature extraction
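The three preprocessing steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the min-max form of histogram stretching, the 3 × 3 window of the adaptive Wiener filter, the noise estimate (mean of the local variances), and the use of a center crop are all assumptions, and the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def stretch_contrast(img):
    """Histogram stretching (assumed min-max form): map intensities to [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


def local_mean(img, k):
    """Mean over a k x k neighbourhood (k odd), with reflected borders."""
    padded = np.pad(img, k // 2, mode="reflect")
    return sliding_window_view(padded, (k, k)).mean(axis=(-2, -1))


def wiener_smooth(img, k=3):
    """Adaptive Wiener smoothing; the window size k is an assumption."""
    mu = local_mean(img, k)
    var = np.maximum(local_mean(img * img, k) - mu * mu, 0.0)
    noise = var.mean()  # noise power estimated as the average local variance
    gain = np.maximum(var - noise, 0.0) / np.maximum(var, max(noise, 1e-12))
    return mu + gain * (img - mu)


def center_crop(img, size):
    """Crop a size x size patch around the image center (assumed crop position)."""
    h, w = img.shape
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]


def preprocess(img, size=224):
    """Stretch, smooth, then crop to the CNN input size (227 for Alex, 224 for VGG)."""
    return center_crop(wiener_smooth(stretch_contrast(img)), size)
```

Because the filter gain lies between 0 and 1, each output pixel is a blend of the local mean and the stretched pixel, so the result stays within [0, 1].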

Feature extraction using CNN
To extract useful features that describe the LDCT images, three popular CNN architectures are investigated, namely, the Alex [9], VGG16, and VGG19 networks [10]. A detailed comparison between these architectures is provided in Table 1 [21]. To avoid over-fitting, these architectures were originally trained on a large number of images (1.2 million for the Alex network [9], and 1.3 million for VGG16 and VGG19 [10]) in order to adjust their very large number of parameters (~60, 138, and 144 million for Alex, VGG16, and VGG19, respectively). In our experiments, we investigate these different CNN models as a reliable descriptor for LDCT images. Since the collected data size is small (320 LDCT images), the proposed CADe system applies transfer learning, where the parameters of the convolutional layers are kept unchanged (as trained on the original data in [9,10]) and only the fully connected (FC) layers are trained and customized using the new database of interest. Transfer learning has been applied repeatedly by different research teams with remarkable success [22]. To apply transfer learning, the last FC layer (the output layer) is customized based on the number of classes, i.e., two neurons in our experiments (normal or cancerous). In the proposed CADe system, the FC layer just before the output layer is used as a compact high-level feature descriptor (of length 4096 for all three investigated models, as shown in Table 1) to precisely describe the LDCT image.
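As a rough check on the parameter counts quoted for VGG16 and VGG19, the short sketch below tallies weights and biases layer by layer. The layer configurations are the standard published VGG layouts (3 × 3 convolutions, 2 × 2 pooling, two FC layers of 4096 units) and are an assumption here, since Table 1 is not reproduced in this text; the function names are hypothetical.

```python
# Standard VGG layouts: numbers are output channels of 3x3 conv layers, "M" is 2x2 max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolutional layer."""
    return k * k * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def vgg_param_count(cfg, n_classes=1000, input_size=224):
    """Return (convolutional parameters, fully connected parameters)."""
    conv_total, c_in, size = 0, 3, input_size
    for item in cfg:
        if item == "M":
            size //= 2  # each pooling layer halves the spatial size
        else:
            conv_total += conv_params(c_in, item)
            c_in = item
    flat = c_in * size * size  # 512 * 7 * 7 inputs to the first FC layer
    fc_total = (fc_params(flat, 4096) + fc_params(4096, 4096)
                + fc_params(4096, n_classes))
    return conv_total, fc_total


for name, cfg in (("VGG16", VGG16_CFG), ("VGG19", VGG19_CFG)):
    conv, fc = vgg_param_count(cfg)
    print(f"{name}: {conv + fc:,} parameters ({conv:,} convolutional)")
```

Calling `vgg_param_count(cfg, n_classes=2)` gives the count after the output layer is replaced by two neurons; under transfer learning only the FC part of that count is retrained, while the convolutional part stays fixed.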

Genetic Algorithm (GA) feature optimization
Since transfer learning keeps the convolutional layers' parameters unchanged (they are not trained on the database of interest), the extracted feature vector is generic and may involve redundant information. In order to select the most relevant features for lung nodule detection, a GA is designed. The output of the GA is a binary chromosome of length 4096 bits, evaluated offline in the training phase of the CADe system, in which a bit of logic '1' indicates that the corresponding feature is relevant and a bit of logic '0' indicates that the feature is irrelevant, so it is removed from the optimized feature vector used in the test phase. The details of the GA design are illustrated in Figure 4.
As shown in Figure 4, the parameters of the GA are first set (e.g., population size, number of generations, mutation probability, etc.). Then, the GA generates a random initial population of binary chromosomes, each of length 4096, i.e., the size of the feature vector extracted from the CNN model. The fitness function is then evaluated for each chromosome in the population. This fitness function is defined as the accuracy of detection (AD) [23]:

$$\mathrm{AD} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

where $TP$, $TN$, $FP$, and $FN$ denote the true positives, true negatives, false positives, and false negatives, respectively.
For feature selection, a subset of the training images is iteratively processed by the GA (i.e., one half of the training images; the other half is used to train the classifier, the last step in the proposed CADe system). At each iteration, genetic operations (selection, crossover, and mutation) are applied in order to form a new population. The process is repeated until the detection accuracy (fitness, AD) no longer improves. After GA termination, the chromosome with the best fitness (maximum detection accuracy, AD) represents the optimal feature vector that should be used for nodule detection, where a bit of '1' in the chromosome means that the feature is selected and a bit of '0' means that the feature is not selected (see Figure 4). The proposed CADe system uses the following GA settings: tournament selection [24] with a tournament size of four, blending crossover [25], and adaptive feasible mutation [26]. The number of generations is set to a maximum of 100.
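The GA loop described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions rather than the configuration used in the paper: uniform crossover and bit-flip mutation are swapped in for blending crossover and adaptive feasible mutation, a nearest-centroid rule stands in for the classifier inside the fitness function, the fitness is measured on a held-out validation split, one elite chromosome is copied to each new generation, and a fixed number of generations replaces the stop-on-no-improvement rule. All function and parameter names are hypothetical.

```python
import numpy as np


def fitness(mask, X_tr, y_tr, X_val, y_val):
    """Detection accuracy on validation data using only the features where mask is True.

    A nearest-centroid rule stands in for the real classifier (assumption).
    """
    if not mask.any():  # an empty feature subset cannot classify anything
        return 0.0
    Xt, Xv = X_tr[:, mask], X_val[:, mask]
    c0 = Xt[y_tr == 0].mean(axis=0)  # centroid of the normal class
    c1 = Xt[y_tr == 1].mean(axis=0)  # centroid of the cancerous class
    closer_to_c1 = np.linalg.norm(Xv - c1, axis=1) < np.linalg.norm(Xv - c0, axis=1)
    return float(np.mean(closer_to_c1.astype(int) == y_val))


def ga_select(X_tr, y_tr, X_val, y_val, pop_size=20, generations=30,
              tournament=4, p_mut=0.01, seed=0):
    """Return (best binary mask, best fitness recorded at each generation)."""
    rng = np.random.default_rng(seed)
    n_feat = X_tr.shape[1]
    pop = rng.random((pop_size, n_feat)) < 0.5  # random binary chromosomes
    history = []

    def evaluate(population):
        return np.array([fitness(c, X_tr, y_tr, X_val, y_val) for c in population])

    for _ in range(generations):
        fit = evaluate(pop)
        history.append(float(fit.max()))
        children = [pop[fit.argmax()].copy()]  # elitism: keep the current best
        while len(children) < pop_size:
            parents = []
            for _p in range(2):  # tournament selection of two parents
                idx = rng.choice(pop_size, size=tournament, replace=False)
                parents.append(pop[idx[fit[idx].argmax()]])
            take_first = rng.random(n_feat) < 0.5  # uniform crossover (swapped in)
            child = np.where(take_first, parents[0], parents[1])
            child ^= rng.random(n_feat) < p_mut  # bit-flip mutation (swapped in)
            children.append(child)
        pop = np.array(children)

    fit = evaluate(pop)
    history.append(float(fit.max()))
    return pop[fit.argmax()], history
```

In the setting of the paper, `X_tr` and `X_val` would hold 4096-dimensional FC-layer features, and the returned mask would be applied unchanged to the features of test images. Keeping the elite chromosome means the best fitness never decreases from one generation to the next.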

Classification
The last step in the proposed CADe system is to classify the LDCT images as normal or cancerous, based on the reduced-dimension feature descriptor optimized by the GA, which is trained using one half of the training images. The other half of the training images is used to train the classifier. For this task, we investigate several classifiers to detect the pulmonary nodules, including K-nearest neighbor (KNN), decision trees, and a binary support vector machine (SVM) with a linear kernel.

Performance metrics
To evaluate the efficiency of the proposed CADe system, standard detection metrics are used on the test data, namely, the accuracy of detection (AD) (see Equation (1)), the specificity, and the sensitivity, defined as follows [23]:

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (2)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (3)$$
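A small sketch shows how the three metrics follow from the confusion-matrix counts. The counts in the example are not taken from the paper: they are one combination that is arithmetically consistent with the reported best results (96.25%, 97.5%, 95%), assuming a balanced test set of 80 images (25% of 320). The function name is hypothetical.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of cancerous images detected
    specificity = tn / (tn + fp)  # fraction of normal images correctly cleared
    return accuracy, sensitivity, specificity


# Assumed counts: 40 cancerous and 40 normal test images,
# with 1 missed nodule image and 2 false alarms.
acc, sens, spec = detection_metrics(tp=39, tn=38, fp=2, fn=1)
print(f"accuracy={acc:.2%}, sensitivity={sens:.1%}, specificity={spec:.0%}")
```

With a test set of this size, a single misclassified image changes the accuracy by 1.25 percentage points, which is worth keeping in mind when comparing models.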

Experimental Results and Discussion
The collected data (320 images) are divided into 75% for training and 25% for testing (see Table 2). Training is carried out in two phases, each with an equal number of training images. The first phase is GA training, in order to select the most relevant features for the task of pulmonary nodule detection. The second phase is classifier training, based on the selected feature vector, in order to improve the detection accuracy. The details of the training and test datasets are given in Table 2. Note that all experiments select equal numbers of normal and cancerous images to avoid any bias.
In order to investigate the potential of the different CNN models to extract useful features that describe the LDCT images, we first test the full feature vector of length 4096 from the last FC layer, just before the output layer of each model, for the task of pulmonary nodule detection. Three different types of classifiers are investigated in order to obtain the best detection accuracy, namely, KNN, decision trees, and SVM. As reported in Table 3, the SVM achieves the best detection accuracies of 88.8%, 91.2%, and 86.3% using the Alex, VGG16, and VGG19 models, respectively. These results highlight that these CNN models can extract useful features that can be efficiently used for nodule detection. In addition, these results indicate that the SVM outperforms the other classifiers tested.
To further optimize the feature descriptor, the proposed GA optimization is investigated using the best type of classifier (SVM). As shown in Table 3, the GA step improves the detection accuracies to 92.5% [11], 91.3%, and 96.3% using Alex, VGG16, and VGG19, respectively. In addition, the GA optimization reduces the feature dimension from 4096 to 1822, 1925, and 1925 for Alex, VGG16, and VGG19, respectively. These results show the promise of using a GA in the proposed CADe system to improve the overall detection accuracy and to speed up the classifier by removing the burden of irrelevant features. Figure 5 shows visual samples of correctly classified LDCT images from different subjects using the proposed CADe system with GA optimization and the SVM classifier, for the three different CNN models.

Fig. 5. Samples of correctly classified normal (first two columns) and cancerous (last two columns) images using the proposed CADe system, with the Alex (first row), VGG16 (second row), and VGG19 (third row) models

To discuss the limitations of the proposed CADe system, Figure 6 shows samples of false positives and false negatives of our system. As shown in these samples, vascular structures can resemble nodule shapes, leading to false positive and false negative cases. In the future, we will investigate the fusion of the features of the different models in order to improve the accuracy.
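The 75%/25% division with two equal training phases can be sketched as a class-balanced index split. This is an illustrative sketch only: the paper does not state how the random selection was performed, so the per-class shuffling, the seed, and the function name are assumptions.

```python
import numpy as np


def balanced_split(labels, train_frac=0.75, seed=0):
    """Split indices into (GA training, classifier training, test), balanced per class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    ga_idx, clf_idx, test_idx = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(train_frac * len(idx)))
        half = n_train // 2  # the two training phases get equal shares
        ga_idx.extend(idx[:half])
        clf_idx.extend(idx[half:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(ga_idx), np.array(clf_idx), np.array(test_idx)


# 160 normal (label 0) and 160 cancerous (label 1) images, as in the dataset description.
labels = np.array([0] * 160 + [1] * 160)
ga_idx, clf_idx, test_idx = balanced_split(labels)
print(len(ga_idx), len(clf_idx), len(test_idx))
```

For 320 balanced images this yields 120 images for GA training, 120 for classifier training, and 80 for testing, each half normal and half cancerous.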

Conclusion
A CADe system for the early detection of pulmonary nodules is presented. The CADe system is based on transfer learning to obtain a generic feature descriptor and a genetic algorithm (GA) to select the most relevant features for early pulmonary cancer detection.
The accuracy of the CADe system is tested on an online publicly available database, the Early Lung Cancer Action Project (ELCAP). The proposed CADe system shows promising results in early lung cancer detection with respect to the state-of-the-art methods. In the future, more databases will be investigated in order to assess the robustness of the proposed system.