Predicting Malignancy Risk of Screen-Detected Lung Nodules–Mean Diameter or Volume

Open Archive. Published: October 24, 2018. DOI: https://doi.org/10.1016/j.jtho.2018.10.006

      Abstract

      Objective

      In lung cancer screening practice using low-dose computed tomography, both diameter and volumetric measurements have been used in the management of screen-detected lung nodules. The aim of this study was to compare the performance of nodule malignancy risk prediction tools using diameter or volume and between computer-aided detection (CAD) and radiologist measurements.

      Methods

      Multivariable logistic regression models were prepared by using data from two multicenter lung cancer screening trials. For model development and validation, baseline low-dose computed tomography scans from the Pan-Canadian Early Detection of Lung Cancer Study and a subset of National Lung Screening Trial (NLST) scans with lung nodules 3 mm or more in mean diameter were analyzed by using the CIRRUS Lung Screening Workstation (Radboud University Medical Center, Nijmegen, the Netherlands). In the NLST sample, nodules with cancer had been matched on the basis of size to nodules without cancer.

      Results

      Both CAD-based mean diameter and volume models showed excellent discrimination and calibration, with similar areas under the receiver operating characteristic curves of 0.947. The two CAD models had predictive performance similar to that of the radiologist-based model. In the NLST validation data, the CAD mean diameter and volume models also demonstrated excellent discrimination: areas under the curve of 0.810 and 0.821, respectively. These performance statistics are similar to those of the Pan-Canadian Early Detection of Lung Cancer Study malignancy probability model with use of these data and radiologist-measured maximum diameter.

      Conclusion

      Either CAD-based nodule diameter or volume can be used to assist in predicting a nodule's malignancy risk.

      Introduction

      Following recommendations by the U.S. Preventive Services Task Force, the Centers for Medicare and Medicaid Services, and the Canadian Task Force on Preventive Health Care that support use of low-dose computed tomography (LDCT) of the chest to decrease lung cancer mortality,
      • Humphrey L.L.
      • Deffebach M.
      • Pappas M.
      • et al.
      Screening for lung cancer with low-dose computed tomography: a systematic review to update the U.S. Preventive Services Task Force recommendation.
      • Lewin G.
      • Morissette K.
      • Dickinson J.
      • et al.
      Recommendations on screening for lung cancer.
      LDCT screening is being implemented in the United States and Canada. One concern with population screening is the large number of lung nodules that are identified by LDCT, most of which are benign. A number of lung nodule management guidelines and nodule malignancy risk prediction tools have been developed to guide management. When two or more scans are available, the management of lung nodules is straightforward because nodule growth is the most important indicator of malignancy. The first screening LDCT is more challenging because it has important implications for the rate of early-recall imaging studies for intermediate or suspicious nodules and for the interval to the next screening LDCT, such as an annual or biennial repeat. The PanCan nodule malignancy probability model (PanCan model [synonyms are the McWilliams model, Brock model, and Tammemägi model]) was developed on the basis of nodules detected and followed from the first screen in the prospective Pan-Canadian Early Detection of Lung Cancer Study (PanCan study).
      • McWilliams A.
      • Tammemagi M.
      • Mayo J.
      • et al.
      Probability of cancer in pulmonary nodules detected on first screening computed tomography.
      This model was initially validated externally with use of British Columbia Cancer screening data, and it has subsequently been validated in multiple studies in different settings by different research groups.
      • Al-Ameri A.
      • Malhotra P.
      • Thygesen H.
      • et al.
      Risk of malignancy in pulmonary nodules: a validation study of four prediction models.
      • Winkler Wille M.M.
      • van Riel S.J.
      • Saghir Z.
      • et al.
      Predictive accuracy of the PanCan lung cancer risk prediction model -external validation based on CT from the Danish Lung Cancer Screening Trial.
      • Zhao H.
      • Marshall H.M.
      • Yang I.A.
      • et al.
      Screen-detected subsolid pulmonary nodules: long-term follow-up and application of the PanCan lung cancer risk prediction model.
      • van Riel S.J.
      • Ciompi F.
      • Jacobs C.
      • et al.
      Malignancy risk estimation of screen-detected nodules at baseline CT: comparison of the PanCan model, Lung-RADS and NCCN guidelines.
      • Nair V.S.
      • Sundaram V.
      • Desai M.
      • Gould M.K.
      Accuracy of models to identify lung nodule cancer risk in the National Lung Screening Trial.
      The PanCan risk calculator has been recommended for use in some settings by the British Thoracic Society Guidelines for the Investigation and Management of Nodules and the American College of Radiology’s Lung Imaging Reporting and Data System.
      • Callister M.E.
      • Baldwin D.R.
      • Akram A.R.
      • et al.
      British Thoracic Society guidelines for the investigation and management of pulmonary nodules.
      • Baldwin D.R.
      • Callister M.E.
      Guideline Development Group
      The British Thoracic Society guidelines on the investigation and management of pulmonary nodules.
      ACR American College of Radiology
      Lung CT screening reporting and data system (Lung-RADS).
      In the PanCan model, the nodule size used for risk prediction was the maximum length as measured manually by radiologists. Published guidelines use two-dimensional perpendicular mean diameter
      ACR American College of Radiology
      Lung CT screening reporting and data system (Lung-RADS).
      or volumetric measurement
      • Horeweg N.
      • van der Aalst C.M.
      • Vliegenthart R.
      • et al.
      Volumetric computed tomography screening for lung cancer: three rounds of the NELSON trial.
      or a combination of both.
      • Callister M.E.
      • Baldwin D.R.
      • Akram A.R.
      • et al.
      British Thoracic Society guidelines for the investigation and management of pulmonary nodules.
      • Baldwin D.R.
      • Callister M.E.
      Guideline Development Group
      The British Thoracic Society guidelines on the investigation and management of pulmonary nodules.
      Volumetric measurement uses computer-aided detection (CAD)
      • Horeweg N.
      • van der Aalst C.M.
      • Vliegenthart R.
      • et al.
      Volumetric computed tomography screening for lung cancer: three rounds of the NELSON trial.
      tools, whereas measurement of mean diameter can also be performed manually with electronic calipers. CAD-based measurements reduce interobserver variation but require dedicated software tools. The aim of this study was to compare the performance of nodule malignancy risk prediction tools using diameter or volume and between CAD and radiologist measurements.

      Methods

      The current study uses data from the PanCan study, which has been described elsewhere.
      • Tammemagi M.C.
      • Schmidt H.
      • Martel S.
      • et al.
      Participant selection for lung cancer screening by risk modelling (the Pan-Canadian Early Detection of Lung Cancer [PanCan] study): a single-arm, prospective study.
      In brief, the PanCan study was a single-arm prospective lung cancer screening study in which 2537 individuals at high risk for lung cancer were screened using LDCT in eight centers across Canada. Entry into the study was based on having a 6-year lung cancer risk of 2% or higher according to a prototype of the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial 2012 (PLCOm2012) risk prediction model.
      • Tammemagi M.C.
      • Katki H.A.
      • Hocking W.G.
      • et al.
      Selection criteria for lung-cancer screening.
      In the current study, baseline PanCan study scans that had radiologist-detected nodules were reread using the CIRRUS CAD system (CIRRUS Lung Screening, Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, the Netherlands). The list of malignant lung nodules reported previously
      • McWilliams A.
      • Tammemagi M.
      • Mayo J.
      • et al.
      Probability of cancer in pulmonary nodules detected on first screening computed tomography.
      was updated.
      To validate the models, we used CAD read data from a subset of NLST scans. The data set was obtained from the NLST repository. The sampling of the NLST data was driven by alternative study questions to measure airways and lung parenchyma features associated with lung cancer or no lung cancer. The sample included three categories: (1) LDCT scans from all participants in whom lung cancer had been diagnosed within 5 years of the baseline scan on the basis of size of the largest recorded lung nodule (if present); (2) baseline LDCT scans from participants with no lung cancer who were matched by year of entry, study site, smoking status (current versus former smokers), and nodule closest in size; and (3) prior LDCT scans from individuals with interval lung cancer.

      Semiautomated Reading of Screening LDCT Scans

      A research workstation specifically designed to read screening LDCT scans (CIRRUS Lung Screening, Diagnostic Image Analysis Group) was used. CIRRUS Lung Screening integrates three CAD software packages that identify solid and subsolid lung nodules. The performance of the CIRRUS Lung Screening software has been previously reported.
      • Murphy K.
      • van Ginneken B.
      • Schilham A.M.
      • de Hoop B.J.
      • Gietema H.A.
      • Prokop M.
      A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification.
      • Jacobs C.
      • Sanchez C.I.
      • Saur S.C.
      • Twellmann T.
      • de Jong P.A.
      • van Ginneken B.
      Computer-aided detection of ground glass nodules in thoracic CT images using shape, intensity and context features.
      • Jacobs C.
      • van Rikxoort E.M.
      • Twellmann T.
      • et al.
      Automatic detection of subsolid pulmonary nodules in thoracic computed tomography images.
      The lung nodules were automatically segmented. The radiologist checked the results and edited the segmentation if needed. False-positive nodules were removed. False-negative nodules 4 mm or larger were added by the human observer for segmentation by CIRRUS. Calcified or noncalcified lung nodules 3 mm or smaller were excluded from analysis. Perifissural nodules, defined as small solid nodules adjacent to pleural fissures that are thought to represent intrapulmonary lymph nodes, were also excluded from analysis, in keeping with the Fleischner Society guideline and other published studies.
      • McWilliams A.
      • Tammemagi M.
      • Mayo J.
      • et al.
      Probability of cancer in pulmonary nodules detected on first screening computed tomography.
      • Ahn M.I.
      • Gleeson T.G.
      • Chan I.H.
      • et al.
      Perifissural nodules seen at CT screening for lung cancer.
      • de Hoop B.
      • van Ginneken B.
      • Gietema H.
      • Prokop M.
      Pulmonary perifissural nodules on CT scans: rapid growth is not a predictor of malignancy.
      • MacMahon H.
      • Naidich D.P.
      • Goo J.M.
      • et al.
      Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017.
      The radiologists who read the scans decided whether a nodule was perifissural.

      Statistical Methods

      Descriptive statistics were prepared by using contingency table analysis for categorical data and Fisher’s exact test. The 95% confidence intervals (CIs) for proportions were estimated by using the binomial exact method. Comparisons of nonnormal and normal continuous data were made by using the nonparametric test of trend
      • Cuzick J.
      A Wilcoxon-type test for trend.
      and Student’s t test, respectively.
      Multivariable logistic regression models evaluated the same set of predictors as were included in the original PanCan study model.
      • McWilliams A.
      • Tammemagi M.
      • Mayo J.
      • et al.
      Probability of cancer in pulmonary nodules detected on first screening computed tomography.
      The outcome was lung cancer, and the radiological predictor characteristics were CAD-derived. Variables with effect estimates that approached the null and did not improve prediction were excluded from the final models. Predictor variables with effect estimates that were in the same direction and of a magnitude similar to those observed in the original model were retained in the CAD models even when they were not statistically significant.

      Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. 2nd ed. 2015. New York: Springer-Verlag.

      The nodule two-dimensional perpendicular mean diameter and volume were right-skewed and were natural log–transformed. Nonlinear effects of continuous variables were evaluated by using locally weighted scatterplot smoothing plots and multivariable fractional polynomials.
      • Royston P.
      • Sauerbrei W.
      Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables.
      Because some individuals had multiple nodules, the variances of effect estimates were adjusted for clustering of data within individuals by using the Huber and White robust “sandwich” variance estimator.
      • White H.
      A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity.
      A model’s overall predictive performance was assessed by using the Brier score.
      • Rufibach K.
      Use of Brier score to assess binary predictions.
      Discrimination (ability to classify correctly) was measured by using the area under the receiver operating characteristic curve (AUC). Calibration (whether the model-predicted probabilities matched the observed probabilities) was assessed by using the Spiegelhalter z statistic p value (with significance indicating poor calibration) and by comparing the mean predicted and observed overall probabilities of outcomes. Brier scores and AUCs are presented with bootstrap bias-corrected 95% CIs based on 1000 resamplings.
      • Pepe M.S.
      • Longton G.
      • Janes H.
      Estimation and comparison of receiver operating characteristic curves.
      The predictive performances of the CAD mean diameter and volume models were compared with that of the original PanCan model in the current development and validation samples.
      All reported p values are two sided, unless otherwise indicated. Statistics and figures were prepared by using Stata 15.1 MP software (StataCorp, College Station, TX).
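      The three performance measures described above can be illustrated with a minimal sketch. This is not the Stata code used in the study; the function names are hypothetical, the bootstrap confidence intervals and the clustering adjustment are omitted, and the pairwise AUC shown here is quadratic in the number of nodules, so it is meant only for small illustrative inputs.

```python
import math

def brier_score(y, p):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    """Probability that a randomly chosen cancer nodule receives a higher
    predicted risk than a randomly chosen noncancer nodule (ties count 1/2)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def spiegelhalter_z(y, p):
    """Spiegelhalter z statistic; values far from 0 suggest poor calibration."""
    num = sum((yi - pi) * (1 - 2 * pi) for yi, pi in zip(y, p))
    var = sum((1 - 2 * pi) ** 2 * pi * (1 - pi) for pi in p)
    return num / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def observed_expected_ratio(y, p):
    """Observed event proportion divided by mean predicted probability."""
    return (sum(y) / len(y)) / (sum(p) / len(p))

# Toy data: 1 = lung cancer, 0 = no lung cancer (illustrative values only).
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(brier_score(outcomes, predicted), auc(outcomes, predicted))
print(two_sided_p(spiegelhalter_z(outcomes, predicted)))
```

      A ratio of observed to expected probabilities close to 1 and a nonsignificant Spiegelhalter p value are both read as evidence of good calibration.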

      Results

      The characteristics of the study participants and nodules in the development and validation samples are described in Table 1.
      Table 1. Characteristics of Individuals and Nodules in the Analytic Development PanCan Study Sample and in the Validation NLST Subsample, Stratified by Lung Cancer Status

      | Data | No Lung Cancer | Lung Cancer | p Value | Total |
      |---|---|---|---|---|
      | PanCan data | | | | |
      | Person level | n = 1600 | n = 111 | | N = 1711 |
      | Age, y, mean | 62.5 (SD 5.8) | 63.4 (SD 5.7) | 0.098 | 62.5 (SD 5.8) |
      | Sex: female | 741 (46.3%) | 61 (54.9%) | | 802 (46.9%) |
      | Sex: male | 859 (53.7%) | 50 (45.1%) | 0.094 | 909 (53.1%) |
      | Family history of lung cancer: no | 1069 (66.8%) | 69 (62.2%) | | 1138 (66.5%) |
      | Family history of lung cancer: yes | 531 (33.2%) | 42 (37.8%) | 0.349 | 573 (33.5%) |
      | Emphysema: no | 653 (40.8%) | 29 (26.1%) | | 682 (39.9%) |
      | Emphysema: yes | 947 (59.2%) | 82 (73.9%) | 0.002 | 1029 (60.1%) |
      | Pack-years smoked, mean | 54.9 (SD 24.0) | 54.8 (SD 23.4) | 0.967 | 54.9 (SD 24.0) |
      | Nodule level | n = 5662 | n = 117 | | N = 5779 |
      | Spiculation: no | 5517 (97.4%) | 79 (67.5%) | | 5596 (96.8%) |
      | Spiculation: yes | 145 (2.6%) | 38 (32.5%) | 0.001 | 183 (3.2%) |
      | Upper lobe location: no | 3021 (53.4%) | 44 (37.6%) | | 3065 (53.0%) |
      | Upper lobe location: yes | 2641 (46.6%) | 73 (62.4%) | 0.001 | 2714 (47.0%) |
      | Nodule type: nonsolid | 983 (17.4%) | 26 (22.2%) | | 1009 (17.5%) |
      | Nodule type: semisolid | 278 (4.9%) | 33 (28.2%) | | 311 (5.4%) |
      | Nodule type: solid | 4401 (77.7%) | 58 (49.6%) | 0.001 | 4459 (77.2%) |
      | Nodule count, median | 5 (IQR 3–8) | 4 (IQR 3–8) | 0.283 | 5 (IQR 3–8) |
      | Nodule diameter, mm, mean | 5.4 (SD 3.2) | 16.8 (SD 10.2) | <0.001 | 5.6 (SD 3.8) |
      | Nodule volume, mm³, mean | 169.0 (SD 1261.3) | 4669.7 (SD 13304.6) | <0.001 | 260.2 (SD 2348.2) |
      | NLST data | | | | |
      | Person level | n = 3239 | n = 441 | | N = 3680 |
      | Age, y, mean | 62.1 (SD 5.2) | 63.7 (SD 5.3) | 0.001 | 62.3 (SD 5.3) |
      | Sex: female | 1426 (44.0%) | 180 (40.8%) | | 1606 (43.6%) |
      | Sex: male | 1813 (56.0%) | 261 (59.2%) | 0.219 | 2074 (56.4%) |
      | Family history of lung cancer: no | 2514 (77.6%) | 324 (73.5%) | | 2838 (77.1%) |
      | Family history of lung cancer: yes | 725 (22.4%) | 117 (26.5%) | 0.053 | 842 (22.9%) |
      | Emphysema: no | 2131 (65.8%) | 253 (57.4%) | | 2384 (64.8%) |
      | Emphysema: yes | 1108 (34.2%) | 188 (42.6%) | 0.001 | 1296 (35.2%) |
      | Nodule level | n = 5549 | n = 460 | | N = 6009 |
      | Spiculation: no | 4973 (89.6%) | 286 (62.2%) | | 5259 (87.5%) |
      | Spiculation: yes | 576 (10.4%) | 174 (37.8%) | 0.001 | 750 (12.5%) |
      | Upper lobe location: no | 3205 (57.8%) | 183 (39.8%) | | 3388 (56.4%) |
      | Upper lobe location: yes | 2344 (42.2%) | 277 (60.2%) | 0.001 | 2621 (43.6%) |
      | Nodule type: nonsolid | 1451 (26.2%) | 24 (5.2%) | | 1475 (24.5%) |
      | Nodule type: semisolid | 524 (9.4%) | 109 (23.7%) | | 633 (10.5%) |
      | Nodule type: solid | 3574 (64.4%) | 327 (71.1%) | 0.001 | 3901 (64.9%) |
      | Nodule count, median | 2 (IQR 1–3) | 1 (IQR 1–1) | 0.001 | 2 (IQR 1–3) |
      | Nodule diameter, mm, mean | 7.9 (SD 5.5) | 16.8 (SD 12.1) | 0.001 | 8.6 (SD 6.6) |
      | Nodule volume, mm³, mean | 605.6 (SD 3031.0) | 6289.4 (SD 18214.3) | 0.001 | 1040.7 (SD 6009.4) |

      IQR, interquartile range; NLST, National Lung Screening Trial; PanCan, Pan-Canadian Early Detection of Lung Cancer Study.
      In the PanCan data included in the analysis, there were 1711 individuals who had 5779 nodules and 117 lung cancers (2.0% of nodules). The PanCan CAD mean diameter and volume models are described in Table 2. Both models include sex, family history of lung cancer, emphysema, nodule type, spiculation, nodule location, nodule count, and nodule size. Sex, family history of lung cancer, and emphysema were not statistically significant, but they had effect estimates in a direction and of a magnitude consistent with those of the original PanCan model. Age, which is a predictor present in the original PanCan model, was excluded from the current models because its effect estimate approached the null value and it did not contribute to prediction. In the original PanCan model, pure nonsolid (ground glass) nodules carried a reduced risk compared with solid nodules. In the current data, the risk of ground glass nodules was found not to differ substantially from that of solid nodules, and the two types were pooled together. As was observed in our original PanCan model, the relationship between nodule size and lung cancer probability is nonlinear. Figure 1 describes the nonlinear relationships between mean diameter and volume and lung cancer risk according to the two PanCan CAD models.
      Table 2. Parameter Estimates for the PanCan Model Based on CAD Reading of Scans and Nodule Size Estimated by Mean Diameter or Volume

      | Predictor | Mean Diameter Model^a: OR (95% CI), p Value | Mean Diameter Model^a: Beta Coefficient | Volume Model^b: OR (95% CI), p Value | Volume Model^b: Beta Coefficient |
      |---|---|---|---|---|
      | Sex (F vs. M) | 1.45 (0.95–2.3), p = 0.09 | 0.3749324 | 1.44 (0.93–2.22), p = 0.10 | 0.3641748 |
      | Family history of lung cancer | 1.38 (0.89–2.13), p = 0.15 | 0.3203023 | 1.35 (0.87–2.10), p = 0.19 | 0.2980618 |
      | Emphysema (yes vs. no) | 1.33 (0.83–2.13), p = 0.23 | 0.2878758 | 1.29 (0.80–2.10), p = 0.30 | 0.2535959 |
      | Nodule type (semisolid vs. other) | 2.01 (1.19–3.40), p = 0.009 | 0.7004866 | 2.10 (1.23–3.59), p = 0.006 | 0.7439433 |
      | Nodule spiculation (yes vs. no) | 2.64 (1.53–4.54), p < 0.001 | 0.9698875 | 2.59 (1.48–4.52), p = 0.001 | 0.9501913 |
      | Nodule location (upper vs. other) | 1.65 (1.06–2.58), p = 0.03 | 0.5029189 | 1.65 (1.05–2.60), p = 0.03 | 0.5012571 |
      | Nodule count^T | 0.92 (0.88–0.96), p < 0.001 | -0.0852886 | 0.92 (0.88–0.96), p < 0.001 | -0.0865209 |
      | Nodule size^T | NA, p < 0.001 | -16.12322 | NA, p < 0.001 | -9.228516 |
      | Constant | | -6.535526 | | -6.443251 |

      Predictive performance statistics for the PanCan development data (N = 5779)

      | Statistic | Mean Diameter Model | Volume Model |
      |---|---|---|
      | Brier score^c | 0.015 (0.013–0.018) | 0.015 (0.013–0.018) |
      | AUC^c | 0.947 (0.921–0.964) | 0.947 (0.919–0.964) |
      | Observed/expected means | 0.0202/0.0202 = 1.000 | 0.0202/0.0202 = 1.000 |
      | Spiegelhalter z statistic | p = 0.425 | p = 0.364 |

      Predictive performance statistics for the NLST validation data (N = 6009)

      | Statistic | Mean Diameter Model | Volume Model |
      |---|---|---|
      | Brier score^c | 0.064 (0.059–0.068) | 0.0594 (0.055–0.064) |
      | AUC^c | 0.810 (0.786–0.833) | 0.821 (0.797–0.843) |
      | Observed/expected means | 0.0766/0.0825 = 0.928 | 0.0766/0.0826 = 0.927 |
      | Spiegelhalter z statistic | p < 0.001 | p < 0.001 |

      PanCan, Pan-Canadian Early Detection of Lung Cancer Study; CAD, computer-aided detection; F, female; M, male; OR, odds ratio; CI, confidence interval; AUC, area under the curve; NA, not applicable; LN, natural log.
      a In the mean diameter model, the following transformations apply: Nodule count^T = nodule count – 4 and mean nodule diameter^T = (LN[mean nodule diameter])^(–1) – 0.6222482239.
      b In the volume model, the following transformations apply: Nodule count^T = nodule count – 4 and volume^T = (LN[volume]/10)^(–0.5) – 1.619158938.
      c Bootstrap bias-corrected 95% confidence intervals.
      Figure 1. The relationship between mean nodule diameter and probability of lung cancer as determined by the Pan-Canadian Early Detection of Lung Cancer Study computer-aided detection (CAD) mean diameter model (A) and between nodule volume and probability of lung cancer as determined by the Pan-Canadian Early Detection of Lung Cancer Study CAD volume model (B).
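      How the Table 2 coefficients combine into a nodule-level probability can be sketched as follows. This is an illustration only, not the authors' spreadsheet calculator. It assumes that the size transformations in the Table 2 footnotes are the fractional polynomial terms (ln d)^(-1) for mean diameter and (ln v / 10)^(-0.5) for volume, a reading chosen because it makes risk rise with nodule size given the negative size coefficients. The function and dictionary names are hypothetical.

```python
import math

# Beta coefficients transcribed from Table 2.
DIAMETER_MODEL = {
    "constant": -6.535526, "female": 0.3749324, "family_history": 0.3203023,
    "emphysema": 0.2878758, "semisolid": 0.7004866, "spiculation": 0.9698875,
    "upper_lobe": 0.5029189, "nodule_count": -0.0852886, "size": -16.12322,
}
VOLUME_MODEL = {
    "constant": -6.443251, "female": 0.3641748, "family_history": 0.2980618,
    "emphysema": 0.2535959, "semisolid": 0.7439433, "spiculation": 0.9501913,
    "upper_lobe": 0.5012571, "nodule_count": -0.0865209, "size": -9.228516,
}

def diameter_term(mean_diameter_mm):
    # ASSUMPTION: footnote a is read as (ln d)^(-1) - 0.6222482239.
    return 1.0 / math.log(mean_diameter_mm) - 0.6222482239

def volume_term(volume_mm3):
    # ASSUMPTION: footnote b is read as (ln(v) / 10)^(-0.5) - 1.619158938.
    return (math.log(volume_mm3) / 10.0) ** -0.5 - 1.619158938

def malignancy_probability(model, size_term, female=False, family_history=False,
                           emphysema=False, semisolid=False, spiculation=False,
                           upper_lobe=False, nodule_count=4):
    """Logistic model: sum the linear predictor, then apply the inverse logit."""
    logit = (model["constant"]
             + model["female"] * female
             + model["family_history"] * family_history
             + model["emphysema"] * emphysema
             + model["semisolid"] * semisolid
             + model["spiculation"] * spiculation
             + model["upper_lobe"] * upper_lobe
             + model["nodule_count"] * (nodule_count - 4)  # centered at 4 nodules
             + model["size"] * size_term)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical nodule: 10 mm, spiculated, upper lobe, in a woman with one nodule.
risk = malignancy_probability(DIAMETER_MODEL, diameter_term(10.0),
                              female=True, spiculation=True,
                              upper_lobe=True, nodule_count=1)
print(f"Estimated probability: {risk:.3f}")
```

      Both size terms are defined only for values whose natural log is positive, which holds for all nodules of 3 mm or more included in the analysis.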
      In the PanCan study data, both PanCan CAD models demonstrated excellent overall prediction (Brier scores of 0.015 for both), discrimination (AUC = 0.947 for both mean diameter and volume models [see Table 2 and Supplementary Fig. 1]), and calibration (Spiegelhalter p > 0.35 for both).
      In the NLST data set there were 6009 nodules in 3680 individuals; 460 nodules (7.7%) were lung cancer. In the external validation in NLST data, the PanCan CAD mean diameter and volume models demonstrated excellent discrimination (AUC = 0.810 and 0.821, respectively).
      When the models were restricted to 8- to 15-mm nodules, the overall predictive performance, including discrimination and calibration of the PanCan CAD mean diameter and volume models, remained good (see Supplementary Table 1).
      That both PanCan CAD model risk estimates separate lung cancer from noncancer nodules is graphically demonstrated in the PanCan study data (Fig. 2) and NLST data (Supplementary Fig. 2). Table 3 and Supplementary Figure 3 demonstrate that the sensitivity, specificity, and positive predictive value (PPV) for the PanCan CAD mean diameter and volume models did not differ substantially over the range of risk estimates in the PanCan data. A probability threshold of 0.05 has sensitivities, specificities, and PPVs of 83.8%, 92.8%, and 19.4% and 84.6%, 93.1%, and 20.1% for the mean diameter and volume models, respectively. A probability threshold of 0.06 has sensitivities, specificities, and PPVs of 78.6%, 93.8%, and 20.6% and 80.3%, 94.0%, and 21.7%, respectively.
      Figure 2. Distributions of estimated lung cancer probabilities in the Pan-Canadian Early Detection of Lung Cancer Study participants’ nodules stratified by lung cancer status as determined by the Pan-Canadian Early Detection of Lung Cancer Study computer-aided detection (CAD) mean diameter and volume models. The densities for no lung cancer are truncated to improve visualization of the distributions.
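      Table 3. PanCan CAD Mean Diameter and Volume Model Sensitivities, Specificities, and Positive Predictive Values by Different Model Probability Thresholds, in PanCan Data

      | Threshold | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 | 0.10 | 0.20 | 0.40 | 0.60 |
      |---|---|---|---|---|---|---|---|---|---|---|---|---|---|
      | Mean diameter model | | | | | | | | | | | | | |
      | Sensitivity | 0.949 | 0.897 | 0.880 | 0.855 | 0.838 | 0.786 | 0.778 | 0.744 | 0.718 | 0.701 | 0.521 | 0.222 | 0.120 |
      | Specificity | 0.800 | 0.869 | 0.897 | 0.915 | 0.928 | 0.938 | 0.947 | 0.953 | 0.959 | 0.962 | 0.985 | 0.996 | 0.998 |
      | PPV | 0.088 | 0.124 | 0.151 | 0.172 | 0.194 | 0.206 | 0.233 | 0.248 | 0.263 | 0.277 | 0.409 | 0.520 | 0.583 |
      | NPV | 0.999 | 0.998 | 0.997 | 0.997 | 0.996 | 0.995 | 0.995 | 0.995 | 0.994 | 0.994 | 0.990 | 0.984 | 0.982 |
      | Volume model | | | | | | | | | | | | | |
      | Sensitivity | 0.957 | 0.906 | 0.863 | 0.855 | 0.846 | 0.803 | 0.744 | 0.735 | 0.718 | 0.692 | 0.504 | 0.231 | 0.137 |
      | Specificity | 0.794 | 0.870 | 0.902 | 0.919 | 0.931 | 0.940 | 0.948 | 0.956 | 0.961 | 0.963 | 0.985 | 0.995 | 0.998 |
      | PPV | 0.088 | 0.126 | 0.154 | 0.178 | 0.201 | 0.217 | 0.228 | 0.257 | 0.274 | 0.279 | 0.404 | 0.474 | 0.640 |
      | NPV | 0.999 | 0.998 | 0.997 | 0.997 | 0.997 | 0.996 | 0.994 | 0.994 | 0.994 | 0.993 | 0.990 | 0.984 | 0.982 |

      CAD, computer-aided detection; PanCan, Pan-Canadian Early Detection of Lung Cancer Study; PPV, positive predictive value; NPV, negative predictive value.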
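      The link between the sensitivity, specificity, and predictive values in Table 3 can be checked with a short worked example. The sketch below is illustrative (the function name is hypothetical) and uses the PanCan nodule counts from Table 1, 117 cancer and 5662 noncancer nodules, together with the mean diameter model's sensitivity and specificity at the 0.05 threshold. Because the tabulated sensitivities and specificities are rounded to three decimals, values recomputed this way can differ from the published ones in the third decimal place.

```python
def predictive_values(sensitivity, specificity, n_cancer, n_benign):
    """Return (PPV, NPV) implied by sensitivity, specificity, and group sizes."""
    true_pos = sensitivity * n_cancer
    false_neg = n_cancer - true_pos
    true_neg = specificity * n_benign
    false_pos = n_benign - true_neg
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Mean diameter model at a probability threshold of 0.05 (Table 3).
ppv, npv = predictive_values(0.838, 0.928, n_cancer=117, n_benign=5662)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

      The low PPV at high sensitivity follows directly from the low prevalence of cancer among nodules (about 2%): even a small false-positive fraction among 5662 benign nodules outnumbers the true positives.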
      When predictive performance was evaluated in nodules that were 20 mm or smaller, discrimination was excellent in the PanCan data and moderate in the NLST data. Overall calibration, when assessed by comparing observed and predicted mean probabilities, was very good in both data sets, although the significant Spiegelhalter p value indicates that some estimates differ from the observed values (Supplementary Table 2).
      When the original PanCan model parameters were applied to this study’s CAD-derived data, the AUC was 0.940 (95% CI: 0.912–0.959) versus 0.947 for the PanCan CAD models. Although the differences between the original PanCan maximum diameter model
      • McWilliams A.
      • Tammemagi M.
      • Mayo J.
      • et al.
      Probability of cancer in pulmonary nodules detected on first screening computed tomography.
      and the PanCan CAD mean diameter and volume models were statistically significant (p = 0.01 and p = 0.002, respectively), the magnitude of the differences was not clinically meaningful. When the original PanCan model parameters were applied to this study’s data, the Brier score was 0.016, the respective observed and predicted mean probabilities were 0.0202 and 0.0328, and the Spiegelhalter z statistic p value was less than 0.001. These prediction statistics indicate that the PanCan CAD models provide prediction equal to or better than that of the original PanCan model in these training and testing data.
      Spreadsheet calculators for the PanCan CAD mean diameter and volume models are available free of charge to noncommercial users at https://brocku.ca/lung-cancer-screening-and-risk-prediction/risk-calculators/.

      Discussion

      In this study, the CAD mean diameter and volume models demonstrated excellent overall prediction in the development and validation data, predictive abilities similar to those of the previously validated PanCan model that was based on radiologist-read LDCT scans and maximum nodule size, and similar predictive abilities between the mean diameter and volume models. Both CAD mean diameter and volume models had AUCs of 0.947 and Brier scores of 0.015 in the PanCan data and AUCs of 0.810 and 0.821 and Brier scores of 0.064 and 0.0594 in the NLST data, respectively. In addition, the CAD models demonstrated reasonably high predictive performance when limited to nodules 8 mm to 15 mm in diameter (see Supplementary Table 1). In external validation, the previously published PanCan model demonstrated higher AUCs in its validation data than the CAD-based models demonstrated in the current validation data. However, this is in large part explained by limitations in the NLST sample used for validation.
      The NLST validation data set was developed for purposes other than the current study and was not ideal for validation of prediction models. The sampling of NLST scans matched nodules without cancer to cancer nodules on size. This forced cancer nodules to be similar in size to noncancer nodules. Nodule size is the single most important predictor in the models, and matching on nodule size reduces the ability to predict cancer. In the PanCan data, each nodule was followed prospectively over time to identify which specific nodule was cancer; this was not the case in the NLST. Clinical judgement based on available data was used to decide whether a nodule was cancer. However, in many cases, the diagnosis of lung cancer came from a participant or proxy survey report or from a death registry. For the purposes of this study, it was assumed that the largest nodule was the one with development of cancer. In the PanCan study, 20% of cancers did not occur in the largest nodule. These issues add to measurement error in the NLST sample and bias prediction validation downward. NLST subset sampling included all interval cancers, which led to relative oversampling for prediction purposes. For interval cancers, it would not be possible to predict lung cancer on the basis of nodule characteristics because the causal nodules were not radiologically detected on preceding scans. The NLST sample prevalence of lung cancer was higher than that observed in screening trials or programs. This is expected to lead to the appearance of poorer calibration when screening population–developed models are tested in lung cancer–enriched subset samples of the NLST. For the current study, the NLST subset sample was selected because of the limited availability of similar high-quality data sets, which are time-consuming and expensive to produce.
      Regardless of the numerous limitations of the NLST sample for prediction validation, the two CAD-based models demonstrated moderate to excellent prediction in NLST subset validation.
      The current study has potential limitations. Modeling was based on the CIRRUS CAD system. Whether model performance will differ with use of other CAD systems is unclear and needs to be evaluated. It has been shown that different software packages can result in significant differences in volumetric measurements (Bankier et al., 2017). Until scanners and acquisition protocols can be standardized internationally, it is recommended that volumetric measurement be done by using the same scanner and acquisition protocol for comparison of serial imaging studies (Rydzak et al., 2017). We did not explore other methods of image analysis, such as the relative roundness of a lesion or the relationship between automated mean diameter and volume, because the purpose of our study was to compare mean diameter with volumetric measurement to predict malignancy potential. Mean diameter measurements can be done manually when CAD software is not available.
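      As background to the comparison of mean diameter and volume, the following minimal sketch shows the purely geometric relationship between the two for an idealized, perfectly spherical nodule. This is an assumption made only for illustration: real nodules are not spheres, the study did not analyze this relationship, and the function names are hypothetical.

```python
import math


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume (mm^3) of a sphere with the given diameter (mm): V = pi/6 * d^3."""
    return math.pi / 6.0 * diameter_mm ** 3


def sphere_equivalent_diameter_mm(volume_mm3: float) -> float:
    """Diameter (mm) of the sphere that has the given volume (mm^3)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


# Volume grows with the cube of diameter under the spherical assumption.
for d in (4.0, 6.0, 8.0):
    print(f"diameter {d:.0f} mm -> sphere volume {sphere_volume_mm3(d):.1f} mm^3")
```

      Because volume scales with the cube of diameter in this idealization, the two measures are monotonically related and would rank spherical nodules identically; any difference in predictive performance has to come from departures from sphericity or from measurement error.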
      The models in the current study are not applicable to new nodules in annual repeat screening, in which the malignancy risk of nodules smaller than 8 mm is higher than that of baseline nodules (Pinsky et al., 2017). Currently, there are no validated prediction models for new nodules. In the American College of Radiology’s Lung Imaging Reporting and Data System, the nodule size threshold for early-recall LDCT for new nodules is 4 mm or larger instead of 6 mm or larger for baseline scans (American College of Radiology, Lung-RADS).
      Semisolid nodules were measured and included in the models by their overall size, not by the size of the solid component. In our original PanCan model, nonsolid nodules were found to be at reduced risk compared with solid nodules, and this observation was consistent with those found by others (American College of Radiology, Lung-RADS). In contrast, in the current study, nonsolid nodules were found to be at risk similar to that of solid nodules. Han et al. (2018) found that for detecting malignancy in pure nonsolid nodules, maximum cross-sectional area was superior to volume and other quantitative CT measures. The extent to which the findings for nonsolid nodules in our current study reflect measurement errors resulting from inaccurate measurement of nonsolid nodules by a CAD system remains to be determined.
      Although a risk prediction model does not by itself constitute a nodule management protocol, an accurate risk prediction model can and possibly should be an important component of a nodule management protocol. This study provides support for using CAD-based nodule measurements in risk assessment and also provides additional support for using radiologist-based measurements because they are not inferior to CAD-based measurements. Preference can depend on local practice and availability of CAD software. The differential distribution of characteristics of malignant and nonmalignant lung nodules allows rational selection of a malignancy risk threshold to guide management (Tammemagi and Lam, 2014). However, evaluating the clinical utility of lung nodule management tools or systems requires prospective studies that use quality indicators such as the frequency and results of early-recall computed tomography and positron emission tomography–computed tomography studies, lung cancer detection rate, stage shift, interval cancers, and harm (for example, from unnecessary biopsies and surgical procedures) in different settings with different CAD systems.

      Acknowledgments

      This study was funded by the Terry Fox Research Institute and British Columbia Cancer Foundation. The funding sources had no involvement in the study design, data collection, or analysis and interpretation of data.

      Supplementary Data

      References

        • Humphrey L.L.
        • Deffebach M.
        • Pappas M.
        • et al.
        Screening for lung cancer with low-dose computed tomography: a systematic review to update the U.S. Preventive Services Task Force recommendation.
        Ann Intern Med. 2013; 159: 411-420
        • U.S. Centers for Medicare and Medicaid Services
        Decision memo for screening for lung cancer with low dose computed tomography (LDCT) (CAG-00439N).
        (Accessed August 31, 2015)
        • Lewin G.
        • Morissette K.
        • Dickinson J.
        • et al.
        Recommendations on screening for lung cancer.
        CMAJ. 2016; 188: 425-432
        • McWilliams A.
        • Tammemagi M.
        • Mayo J.
        • et al.
        Probability of cancer in pulmonary nodules detected on first screening computed tomography.
        N Engl J Med. 2013; 369: 910-919
        • Al-Ameri A.
        • Malhotra P.
        • Thygesen H.
        • et al.
        Risk of malignancy in pulmonary nodules: a validation study of four prediction models.
        Lung Cancer. 2015; 89: 27-30
        • Winkler Wille M.M.
        • van Riel S.J.
        • Saghir Z.
        • et al.
        Predictive accuracy of the PanCan lung cancer risk prediction model - external validation based on CT from the Danish Lung Cancer Screening Trial.
        Eur Radiol. 2015; 25: 3093-3099
        • Zhao H.
        • Marshall H.M.
        • Yang I.A.
        • et al.
        Screen-detected subsolid pulmonary nodules: long-term follow-up and application of the PanCan lung cancer risk prediction model.
        Br J Radiol. 2016; 89: 20160016
        • van Riel S.J.
        • Ciompi F.
        • Jacobs C.
        • et al.
        Malignancy risk estimation of screen-detected nodules at baseline CT: comparison of the PanCan model, Lung-RADS and NCCN guidelines.
        Eur Radiol. 2017; 27: 4019-4029
        • Nair V.S.
        • Sundaram V.
        • Desai M.
        • Gould M.K.
        Accuracy of models to identify lung nodule cancer risk in the National Lung Screening Trial.
        Am J Respir Crit Care Med. 2018; 197: 120-123
        • Callister M.E.
        • Baldwin D.R.
        • Akram A.R.
        • et al.
        British Thoracic Society guidelines for the investigation and management of pulmonary nodules.
        Thorax. 2015; 70: ii1-ii54
        • Baldwin D.R.
        • Callister M.E.
        • Guideline Development Group
        The British Thoracic Society guidelines on the investigation and management of pulmonary nodules.
        Thorax. 2015; 70: 794-798
        • ACR American College of Radiology
        Lung CT screening reporting and data system (Lung-RADS).
        (Accessed April 27, 2017)
        • Horeweg N.
        • van der Aalst C.M.
        • Vliegenthart R.
        • et al.
        Volumetric computed tomography screening for lung cancer: three rounds of the NELSON trial.
        Eur Respir J. 2013; 42: 1659-1667
        • Tammemagi M.C.
        • Schmidt H.
        • Martel S.
        • et al.
        Participant selection for lung cancer screening by risk modelling (the Pan-Canadian Early Detection of Lung Cancer [PanCan] study): a single-arm, prospective study.
        Lancet Oncol. 2017; 18: 1523-1531
        • Tammemagi M.C.
        • Katki H.A.
        • Hocking W.G.
        • et al.
        Selection criteria for lung-cancer screening.
        N Engl J Med. 2013; 368: 728-736
        • Murphy K.
        • van Ginneken B.
        • Schilham A.M.
        • de Hoop B.J.
        • Gietema H.A.
        • Prokop M.
        A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification.
        Med Image Anal. 2009; 13: 757-770
        • Jacobs C.
        • Sanchez C.I.
        • Saur S.C.
        • Twellmann T.
        • de Jong P.A.
        • van Ginneken B.
        Computer-aided detection of ground glass nodules in thoracic CT images using shape, intensity and context features.
        Med Image Comput Comput Assist Interv. 2011; 14: 207-214
        • Jacobs C.
        • van Rikxoort E.M.
        • Twellmann T.
        • et al.
        Automatic detection of subsolid pulmonary nodules in thoracic computed tomography images.
        Med Image Anal. 2014; 18: 374-384
        • Ahn M.I.
        • Gleeson T.G.
        • Chan I.H.
        • et al.
        Perifissural nodules seen at CT screening for lung cancer.
        Radiology. 2010; 254: 949-956
        • de Hoop B.
        • van Ginneken B.
        • Gietema H.
        • Prokop M.
        Pulmonary perifissural nodules on CT scans: rapid growth is not a predictor of malignancy.
        Radiology. 2012; 265: 611-616
        • MacMahon H.
        • Naidich D.P.
        • Goo J.M.
        • et al.
        Guidelines for management of incidental pulmonary nodules detected on CT images: from the Fleischner Society 2017.
        Radiology. 2017; 284: 228-243
        • Cuzick J.
        A Wilcoxon-type test for trend.
        Stat Med. 1985; 4: 87-90
        • Harrell F.E.
        Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. 2nd ed.
        Springer-Verlag, New York 2015
        • Royston P.
        • Sauerbrei W.
        Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables.
        John Wiley & Sons, Hoboken, NJ 2008
        • White H.
        A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity.
        Econometrica. 1980; 48: 817-838
        • Rufibach K.
        Use of Brier score to assess binary predictions.
        J Clin Epidemiol. 2010; 63 ([author reply: 939]): 938-939
        • Pepe M.S.
        • Longton G.
        • Janes H.
        Estimation and comparison of receiver operating characteristic curves.
        Stata J. 2009; 9: 1-16
        • Bankier A.A.
        • MacMahon H.
        • Goo J.M.
        • Rubin G.D.
        • Schaefer-Prokop C.M.
        • Naidich D.P.
        Recommendations for measuring pulmonary nodules at CT: a statement from the Fleischner Society.
        Radiology. 2017; 285: 584-600
        • Rydzak C.E.
        • Armato S.G.
        • Avila R.S.
        • Mulshine J.L.
        • Yankelevitz D.F.
        • Gierada D.S.
        Quality assurance and quantitative imaging biomarkers in low-dose CT lung cancer screening.
        Br J Radiol. 2017: 20170401
        • Pinsky P.F.
        • Gierada D.S.
        • Nath P.H.
        • Munden R.
        Lung cancer risk associated with new solid nodules in the National Lung Screening Trial.
        AJR Am J Roentgenol. 2017; 209: 1009-1014
        • Han L.
        • Zhang P.
        • Wang Y.
        • et al.
        CT quantitative parameters to predict the invasiveness of lung pure ground-glass nodules (pGGNs).
        Clin Radiol. 2018; 73: 504.e501-504.e507
        • Tammemagi M.C.
        • Lam S.
        Screening for lung cancer using low dose computed tomography.
        BMJ. 2014; 348: g2253