Introduction
Papillary thyroid cancer (PTC) is one of the most common malignancies of the endocrine system, and its incidence has risen in recent decades [1–3]. Most patients with PTC achieve long-term survival after radical surgery, but recurrence may result in treatment failure [4, 5]. Accurately identifying the cases at high risk of recurrence is therefore important for clinicians when formulating individualized treatment and follow-up protocols [6]. The tumor-node-metastasis staging system proposed by the American Joint Committee on Cancer (AJCC) is currently well recognized for evaluating the prognosis of patients with thyroid cancer (TC) [7]. However, patients with an identical pathological stage and therapeutic schedule may have markedly different clinical outcomes [8]. In recent years, radiomics, a novel technology that extracts large numbers of quantitative features from images through automatic algorithms, has been applied in the diagnosis, staging and prognostic evaluation of diseases [9–11].
Aim
The aim of this study was to investigate the efficacy of ultrasound-based radiomics for predicting the prognosis of patients with PTC who underwent complete endoscopic resection, and to compare it with the traditional AJCC staging system.
Material and methods
General clinical data
The general clinical data of patients who underwent complete endoscopic resection for PTC in our hospital between January 2010 and December 2014 were collected for retrospective case-control analysis.
The inclusion criteria were as follows: a) patients who received thyroid ultrasonography before surgery, with available images, b) those diagnosed with PTC by postoperative histopathology, c) those who underwent thyroid surgery for the first time, and d) those with complete pathological data.
The exclusion criteria involved: a) patients with malignancies in other organs, b) those with evidence of distant metastasis revealed by preoperative examination, c) those with tumors incompletely resected, d) those who received thyroid radiofrequency, microwave therapy or head and neck radiotherapy in the past, or e) those with non-papillary cancer components revealed by postoperative histopathology.
A total of 361 patients were included in the present study, comprising 72 males and 289 females with an average age of 42 (range, 21–79) years. Among them, 239 patients had a tumor diameter ≤ 2 cm and 122 patients had a tumor diameter > 2 cm. Extra-glandular invasion occurred in 74 patients and lymph node metastasis was present in 119 patients. Based on the AJCC staging system (8th edition), there were 289, 54 and 18 patients in stages I, II and III, respectively. Using a random number table, the patients were assigned at a ratio of 7 : 3 to the modeling group (n = 253) and the validation group (n = 108). The baseline data of the two groups were similar (p > 0.05) (Table I).
Table I
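The 7 : 3 allocation described above can be illustrated with a short sketch. It is not the procedure used in the study, which relied on a random number table; a seeded shuffle is swapped in here, and the function name `split_7_3`, the seed and the integer patient identifiers are hypothetical.

```python
import random


def split_7_3(patient_ids, seed=0):
    """Randomly split identifiers into modeling (70%) and validation (30%) sets.

    Assumption: a seeded pseudo-random shuffle stands in for the random
    number table mentioned in the text.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_model = round(len(ids) * 0.7)
    return ids[:n_model], ids[n_model:]


# Hypothetical identifiers 0..360 for the 361 included patients
modeling, validation = split_7_3(range(361))
print(len(modeling), len(validation))
```

With 361 patients, rounding 70% gives group sizes that match those reported (253 and 108).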
Ultrasonography
Thyroid ultrasonography was conducted for all patients before the operation using Philips (HD15), Siemens (ACUSON2000) and General Electric (LogiqE8) ultrasound systems with 10–15 MHz linear array probes (L12-5, 14L5 and 11L-D, respectively). Thyroid lesion images were stored in the Digital Imaging and Communications in Medicine (DICOM) format.
Construction and evaluation of radiomics prediction model
Lesion area delineation
An attending physician with more than 8 years of experience in the ultrasound department selected one eligible ultrasound image per patient and delineated the lesion area using ITK-SNAP software (http://www.itksnap.org). Each image had to meet the following criteria: a) it showed the lesion with the most typical malignant features, b) it displayed the relationship between the lesion and the capsule, c) it showed the lesion at its maximum diameter, and d) it contained no measuring marks or Doppler imaging. For patients with multiple lesions, the largest one was delineated [10]. After all lesion areas were delineated, a chief physician with more than 20 years of experience in the ultrasound department reviewed the images. Any disagreement was resolved by consultation, after which the attending physician re-delineated the corresponding areas. Both physicians were blinded to the patients’ information.
Extraction of radiomic features
The PyRadiomics open-source platform (v2.2.0, https://pyradiomics.readthedocs.io/) was employed to extract radiomic features from the lesion area of each patient, including first-order statistics, 2D shape, texture and wavelet features [12]. In total, 1,209 radiomic features were extracted from each lesion area.
Screening of radiomic features and modeling
In the modeling group (n = 253), the most representative radiomic features were selected using a three-step method. Firstly, univariate Cox regression analysis was employed to identify 119 features significantly associated with recurrence-free survival (p < 0.05). Secondly, Pearson’s correlation coefficient between every two features was calculated. Among strongly correlated features (correlation coefficient > 0.9), those with a weaker association with recurrence-free survival were excluded, leaving 33 features. Thirdly, the 7 radiomic features of the highest prognostic value were identified using least absolute shrinkage and selection operator (LASSO) regression analysis [13, 14]. The radiomics score (Rad-score) was the sum of each radiomic feature multiplied by the corresponding weight coefficient. X-tile software (v3.6.1) was utilized to determine the optimal cut-off values of Rad-score for predicting recurrence-free survival. Furthermore, a nomogram prediction model combining Rad-score with clinicopathological factors was constructed using R software (v4.1.1) [15].
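The second screening step (removal of redundant, strongly correlated features) can be sketched as follows. This is a minimal illustration, not the study's code. It assumes that, among strongly correlated features, the one with the larger univariate Cox p-value is dropped, and that a greedy pass from the smallest p-value upward is acceptable. The function names, feature names and toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Note: a constant feature has zero variance and would raise
    ZeroDivisionError; such features should be removed beforehand.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_correlated(features, p_values, threshold=0.9):
    """Keep a feature only if |r| <= threshold with every feature already kept.

    Features are visited in order of increasing univariate p-value, so the
    feature more strongly associated with the outcome survives
    (assumed selection rule).
    """
    kept = []
    for name in sorted(features, key=lambda f: p_values[f]):
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Toy data: feat_b is almost a multiple of feat_a, feat_c is weakly related
toy_features = {
    "feat_a": [1, 2, 3, 4, 5],
    "feat_b": [2, 4, 6, 8, 10.5],
    "feat_c": [5, 1, 4, 2, 3],
}
toy_p = {"feat_a": 0.01, "feat_b": 0.03, "feat_c": 0.04}
kept = drop_correlated(toy_features, toy_p)
print(kept)  # feat_b is removed because it duplicates feat_a
```

The univariate Cox and LASSO steps are not reproduced here because they require a survival-analysis library.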
Evaluation of prediction model
Harrell’s concordance index was applied to evaluate the prognostic discrimination ability of the prediction model [16]. The calibration curve was plotted to display the consistency between the predicted and actual survival rates. The Akaike information criterion (AIC) was used to compare the predictive ability of models, and a lower AIC value corresponded to higher predictive accuracy [17]. The likelihood ratio χ2 test was employed to evaluate the homogeneity of the prediction model, and a higher likelihood ratio χ2 value represented better homogeneity [18]. Net reclassification improvement (NRI) was utilized to assess the predictive accuracies of different models [19]. Additionally, a decision curve was drawn to compare the clinical benefits of prediction models [20].
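For readers unfamiliar with Harrell’s concordance index, the sketch below shows one common way to compute it by hand. It is an illustration under stated assumptions, not the R routine used in the study. A pair of patients is treated as comparable when the one with the shorter time had an event, and tied risk scores count as half-concordant. The function name and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    times  : follow-up time of each patient
    events : 1 if recurrence was observed, 0 if censored
    risks  : predicted risk (higher means earlier recurrence expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data: five patients, follow-up in months
toy_times = [12, 30, 45, 60, 80]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [2.1, 0.4, 0.9, 0.3, -0.5]
c_index = harrell_c_index(toy_times, toy_events, toy_risks)
print(c_index)  # 7 of 8 comparable pairs are concordant -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.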
Postoperative follow-up
Follow-up was carried out through outpatient review or telephone to obtain the information about whether the disease recurred after surgery. The patients were reviewed by physical, laboratory and imaging examinations once every 3–6 months within 1 year after surgery, and then once every 6–12 months until December 2021. Overall survival was defined as the time from the operation to the last follow-up or death.
The endpoint of this study was recurrence-free survival, which was defined as the time from the operation to the first postoperative recurrence, or the time from the operation to the last follow-up or death if the disease did not recur during follow-up.
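The endpoint definition above maps onto the usual (time, event) pair used in survival analysis. The sketch below is illustrative only. The function name `time_to_event`, the argument names, the example dates and the conversion of days to months (30.4375 days per month) are assumptions, not details taken from the study.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumed average month length (365.25 / 12)


def time_to_event(surgery, recurrence=None, last_contact=None):
    """Return (months, event) for one patient.

    event = 1: first recurrence observed; time runs from surgery to recurrence.
    event = 0: no recurrence; time runs from surgery to the last follow-up
               or death (censored observation).
    """
    if recurrence is not None:
        end, event = recurrence, 1
    elif last_contact is not None:
        end, event = last_contact, 0
    else:
        raise ValueError("either recurrence or last_contact is required")
    months = (end - surgery).days / DAYS_PER_MONTH
    return months, event


# Hypothetical patients
recurred = time_to_event(date(2012, 1, 1), recurrence=date(2013, 1, 1))
censored = time_to_event(date(2012, 1, 1), last_contact=date(2021, 12, 1))
print(recurred, censored)
```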
Statistical analysis
SPSS 22.0 software and R (v4.1.1) software were used for statistical analysis. Categorical data were expressed as percentages, and the χ2 test was employed for comparison between groups. Kaplan-Meier survival curves were plotted to compare recurrence-free survival, and the log-rank test was utilized to determine the significance. The Cox proportional hazards model was used for univariate and multivariate analyses. p < 0.05 was considered statistically significant.
Results
Construction of ultrasound-based radiomics score in modeling group
In the modeling group (n = 253), 7 radiomic features were screened using the LASSO regression model, including original shape 2D PerimeterSurfaceRatio, original shape 2D Elongation, original glcm Autocorrelation, original glszm LGLZE, wavelet-LH glcm ClusterProminence, wavelet-HL ngtdm Contrast, and wavelet-HH glszm GLNN (Figure 1). These features characterized the shape, margin and echo pattern of tumor lesions, and their weight coefficients were 0.17132882, 0.09810219, 0.04008064, 0.11553653, 0.24268264, 0.15965165 and 0.09748579, respectively. The Rad-score ranged from –2.14 to 3.31.
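As a minimal sketch, the Rad-score can be written as a weighted sum of the seven selected features using the coefficients listed above. Two points are assumptions rather than statements from the text: the feature values are taken to be standardized (for example, z-scores) before weighting, and no intercept is added. The dictionary keys simply repeat the feature labels as printed, and the function name `rad_score` is hypothetical.

```python
# Weight coefficients in the order reported in the text
WEIGHTS = {
    "original shape 2D PerimeterSurfaceRatio": 0.17132882,
    "original shape 2D Elongation": 0.09810219,
    "original glcm Autocorrelation": 0.04008064,
    "original glszm LGLZE": 0.11553653,
    "wavelet-LH glcm ClusterProminence": 0.24268264,
    "wavelet-HL ngtdm Contrast": 0.15965165,
    "wavelet-HH glszm GLNN": 0.09748579,
}


def rad_score(feature_values):
    """Sum of each feature value multiplied by its weight coefficient.

    feature_values: dict mapping every key of WEIGHTS to a number
    (assumed to be standardized; no intercept term is included).
    """
    return sum(WEIGHTS[name] * feature_values[name] for name in WEIGHTS)


# Hypothetical patient whose standardized features all equal 1.0
example = {name: 1.0 for name in WEIGHTS}
print(round(rad_score(example), 8))  # equals the sum of the seven weights
```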
Recurrence-free survival
The median follow-up time of the 361 included patients was 108 (8–137) months. The 5- and 10-year recurrence-free survival rates of modeling and validation groups were 92.9% vs. 95.4% and 87.4% vs. 89.8%, respectively.
According to X-tile software, the optimal cut-off values of Rad-score for predicting postoperative recurrence were 0.15 and 0.52, and the patients in modeling and validation groups were further divided into the low-risk group (Rad-score < 0.15, n = 244, 67.5%), medium-risk group (Rad-score: 0.15–0.52, n = 80, 22.2%) and high-risk group (Rad-score > 0.52, n = 37, 10.3%). Kaplan-Meier survival analysis showed that the 10-year recurrence-free survival rates of the low-, medium- and high-risk groups (modeling vs. validation group) were 94.7% vs. 95.9%, 83.6% vs. 80.0%, and 50.0% vs. 66.6%, respectively. Different risk groups had significantly different recurrence-free survival (χ2 = 15.805, 63.590, p < 0.001) (Figure 2).
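The three-level stratification above amounts to a simple threshold rule, sketched below with the reported cut-off values of 0.15 and 0.52. Treating scores exactly equal to a cut-off as medium risk follows the ranges given in the text. The function name `risk_group` is hypothetical.

```python
def risk_group(score, low_cut=0.15, high_cut=0.52):
    """Assign a Rad-score to the low-, medium- or high-risk group."""
    if score < low_cut:
        return "low"
    if score > high_cut:
        return "high"
    return "medium"  # low_cut <= score <= high_cut


# The extreme values are the reported Rad-score range; the others are arbitrary
groups = [risk_group(s) for s in (-2.14, 0.15, 0.30, 0.52, 3.31)]
print(groups)
```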
Factors influencing recurrence-free survival
Univariate analysis revealed that age, tumor diameter, extra-glandular invasion, lymph node metastasis and Rad-score were significantly associated with the recurrence-free survival of the modeling group (p < 0.05). Multivariate analysis showed that age, lymph node metastasis and Rad-score were independent influencing factors for the recurrence-free survival of the modeling group (p < 0.05) (Table II).
Table II
Construction and evaluation of nomogram prediction model
Based on the above-mentioned findings of multivariate analysis, age, lymph node metastasis and Rad-score were selected as the independent prognostic predictors to construct a nomogram model for predicting the prognosis of the modeling group (Figure 3). Harrell’s concordance indices of the model for modeling and validation groups were 0.829 and 0.845, respectively, indicating high predictive accuracies. The calibration curve indicated that the recurrence-free survival predicted by the nomogram model was close to the actual value, suggesting high consistency (Figure 4).
Comparison between nomogram prediction model and AJCC staging system (8th edition)
The AIC values of the nomogram prediction model for modeling and validation groups were 287.02 and 70.35, respectively, which were lower than those of the AJCC staging system (8th edition) (321.54 and 83.92), indicating better predictive ability. The likelihood ratio χ2 values of the nomogram prediction model for modeling and validation groups were 56.07 and 24.65, respectively, also exceeding those of the AJCC staging system (8th edition) (21.56 and 11.08). The results of NRI analysis suggested that, compared with the AJCC staging system (8th edition), the predictive accuracies of the nomogram prediction model for modeling and validation groups improved by about 65.4% and 43.9%, respectively. Furthermore, the nomogram prediction model was better than the AJCC staging system (8th edition) in terms of clinical benefits (Figure 5).
Discussion
In recent years, the incidence rate of TC has been rising worldwide, and it is predicted that TC will replace colorectal cancer as the fourth most common malignancy in 2030 [1, 3]. At present, TC is mainly treated by surgical resection. Since endoscopic thyroidectomy was first completed by Hüscher et al. [21] in the 1990s and thyroid malignancy was removed by Shimizu and Tanaka [22] through the subclavian approach with a small incision for the first time, the safety and feasibility as well as advantages such as minimally invasive technique and aesthetic effects of complete endoscopic thyroidectomy have been demonstrated gradually [23]. PTC is the most common type of TC. Although the prognosis of most patients with PTC is satisfactory, a few tumors are still highly invasive and distantly metastasize after radical resection [24, 25]. Consequently, much attention has been paid to identifying the high-risk factors for postoperative recurrence in such patients and formulating individualized regimens.
Age is recognized as a primary prognostic factor of TC, and tumor-specific death readily occurs in older patients [26, 27]. TC is the only malignancy that includes age in the AJCC staging system, verifying the importance of age for prognostic evaluation [7]. So far, the correlations of age with postoperative recurrence and prognosis of PTC remain unclear, which may be related to the BRAF mutation status of patients, as reported by Shen et al. [28]: the disease mortality of patients carrying the BRAF V600E mutation was found to rise progressively with aging. In this study, age was likewise an independent predictor of postoperative recurrence in patients with PTC, suggesting that elderly patients should be followed up more frequently.
Lymph node metastasis has been closely associated with the postoperative recurrence of PTC [29, 30]. Compared with patients who are pathologically N0 (pN0), the recurrence rate remains significantly higher even in those with micro-infiltration of tumor cells in lymph nodes [31]. In this study, lymph node metastasis was discovered by postoperative pathology in about 1/3 of patients with PTC, and their 10-year recurrence-free survival rate was markedly lower than that of the cases without lymph node metastasis (80% vs. 91%). Hence, standardized and thorough lymph node dissection should be conducted for patients with preoperative lymph node metastasis.
Intratumor heterogeneity spans multiple levels, from gene, protein and metabolism to physiology and anatomy, leading to different sensitivities of different individuals, or even of the same individual, to the same treatment regimen [31]. Radiomics can extract numerous imaging features from medical images, allowing clinicians to quantitatively evaluate intratumor heterogeneity from a macroscopic perspective [32, 33] without requiring tissue biopsy or interventional surgery. These image heterogeneity parameters, as special biological markers, can predict the prognosis and treatment outcomes of patients. As reported by Xiong et al. [34], an ultrasound image-based radiomics prediction model was able to effectively predict the recurrence-free survival rate in patients with invasive breast cancer. In a study conducted by Jiang et al. [35], a CT image-based radiomics prediction model was remarkably correlated with the clinical outcomes and chemotherapy sensitivity of patients with gastric cancer. In the present study, a radiomics prediction model (Rad-score) based on preoperative thyroid ultrasonography was constructed, according to which the patients were further divided into three risk groups with significant prognostic differences. The nomogram prediction model constructed with age, lymph node metastasis and Rad-score was clearly superior to the AJCC staging system in terms of predictive efficiency.
Conclusions
The ultrasound-based radiomics score is an important predictor for postoperative recurrence in patients with PTC undergoing complete endoscopic resection. By combining the radiomics score with other high-risk clinical factors, the nomogram prediction model can be utilized to formulate individualized treatment and follow-up protocols for patients at different risks. Nevertheless, this study has limitations: it was a retrospective study with a small sample size, so the results may be biased. Further prospective studies with larger sample sizes are still needed to validate the results herein.