Introduction
Prostate cancer is a disease with a variable prognosis. Categorization into risk groups is based on T-stage, maximum PSA concentration and Gleason score. In contrast, the ISUP grading system introduced in 2014 is based solely on tumor differentiation. That system merges some groups and subdivides others, such as Gleason score 7 into Grade Group 2 (GG2 – Gleason 3+4) and Grade Group 3 (GG3 – Gleason 4+3). The differences in prognosis between these latter groups were most pronounced in patients undergoing radical prostatectomy (RP) [1]. Although many studies indicate such differences also in patients undergoing radiotherapy (RT), they seem to be less pronounced [1, 2, 3, 4, 5]; moreover, in some studies no significant differences were found [6, 7]. One reason may be the less accurate diagnosis of tumor grade in a biopsy specimen compared with a post-prostatectomy specimen; the inaccuracy of biopsy grading is estimated to exceed 40% [8, 9, 10]. A second reason may be treatment-related factors, such as pelvic lymph node irradiation and/or combination with hormonal therapy, which may influence the natural history of prostate cancer and thus reduce potential differences between grade groups. To sum up, pathological grade is an important prognostic factor, but its reliability when based on a biopsy is somewhat limited and its importance may be modulated by therapeutic factors.
Hence, this study aimed to evaluate how pathologic grade is associated with prognosis in patients undergoing radical radiotherapy, the majority of whom received concurrent hormonal treatment. The analysis focused on patients with Gleason score 7 in order to assess the differences between GG2 and GG3. In the second step, the study assessed the differences in pathology findings, especially grading, between the original biopsy report, issued several years earlier when the study group was collected, and a re-evaluation performed in recent years.
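The relation between Gleason score and Grade Group used throughout this paper can be written as a small lookup. The sketch below is illustrative only. The function name `isup_grade_group` is hypothetical, the mapping outside Gleason score 7 (score 6 to GG1, score 8 to GG4, score 9-10 to GG5) follows the ISUP 2014 consensus [1] rather than anything specific to this study, and the restriction to patterns 3-5 is a simplifying assumption.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map Gleason primary + secondary patterns to an ISUP 2014 Grade Group (1-5).

    Assumption: only patterns 3, 4 and 5 are accepted, so scores range from 6 to 10.
    """
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("this sketch accepts Gleason patterns 3, 4 and 5 only")
    score = primary + secondary
    if score == 6:
        return 1
    if score == 7:
        # Gleason score 7 is split by the dominant (primary) pattern:
        # 3+4 -> GG2, 4+3 -> GG3.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4
    return 5  # Gleason score 9 or 10


for patterns in [(3, 4), (4, 3)]:
    print(patterns, "-> GG", isup_grade_group(*patterns))
```

The two Gleason 7 variants are the only case in which the order of the patterns, and not just their sum, decides the group.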
Material and methods
The study group comprised prostate cancer patients with a primary diagnosis of Gleason score 7 treated with radical external beam radiotherapy (RT) between 2008 and 2013. The aim of the first part was to compare treatment outcome between Grade Group 2 (GG2) and Grade Group 3 (GG3) retrospectively. In the second part of the study, the pathology specimens were re-evaluated according to contemporary standards of evaluation and reporting, in order to assess the differences in grade reporting between the original and contemporary reports, with special emphasis on grade migration. This part of the study was prospective and included not only grade re-assessment but also evaluation of other pathological prognostic factors.
The study was approved by the Institutional Bioethical Committee as part of the MILESTONE project.
Material 1
There were 300 patients in the study group, with a preliminary diagnosis of Gleason 7 prostate cancer: GG2 in 202 patients and GG3 in 98 patients. All patients underwent radical RT between 2008 and 2013. The patients were referred for radiation therapy by urologists, in the majority of cases (88% of patients) already on hormonal therapy. Because referral to radiotherapy was sometimes deferred and/or neoadjuvant hormonal treatment was prolonged, the original pathology examinations were performed in 2005-2012.
Radiotherapy was delivered with 6-20 MV photons from a linear accelerator, conventionally fractionated to a total dose of 76 Gy at 2 Gy per fraction. Dynamic techniques were used in most patients (80%), and almost all had image guidance (either bone- or fiducial-based).
The clinical characteristics of the study group, subdivided into GG2 and GG3, are presented in Table I. The use of hormonal treatment differed between these groups (82% in GG2 vs. 96% in GG3, p = 0.0008). With respect to treatment-related factors, elective pelvic lymph node irradiation was performed in 101 patients (50%) from GG2 compared with 67 patients (68%) from GG3 (p = 0.003). All other differences between the groups in clinical and treatment-related factors were not statistically significant.
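The comparison of pelvic irradiation rates can be reproduced from the counts given above. The following is a minimal sketch, assuming a Pearson χ2 test without continuity correction (the text does not state whether a correction was applied); the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail probability reduces to erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Elective pelvic irradiation: 101 of 202 GG2 patients vs. 67 of 98 GG3 patients.
chi2, p = chi_square_2x2(101, 202 - 101, 67, 98 - 67)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

With these counts the statistic is close to 9.0 and the p-value rounds to 0.003, in line with the reported value.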
Material 2
Among the 300 patients analyzed in the first part, 139 pathology specimens and reports were eligible for re-evaluation. The selection was not random; it reflected the availability of the original tissue. The original biopsy report was issued in our own institution for 31 patients (22%) and in other, mainly regional, pathology departments for 108 patients (78%). Verification was undertaken in 2018 by experienced prostate cancer pathologists (DL, ASW) in line with the most recent ISUP guidelines [1]. Verification was based on the original pathology specimens and paraffin blocks; when necessary, additional sections were cut.
The re-evaluation report included: Grade Group, presence of a cribriform pattern, perineural invasion, the number of involved cores and the mean percentage of core involvement, which was calculated as an arithmetic mean of percentage of invasion from each involved core.
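The core-based measures listed above are simple to compute. As a minimal sketch, assuming each biopsy is recorded as a list of per-core involvement percentages with 0 for uninvolved cores (a hypothetical data layout, not the one used in the study):

```python
from statistics import mean


def mean_core_involvement(core_percentages: list[float]) -> float:
    """Arithmetic mean of tumor involvement (%) over the involved cores only."""
    if any(p < 0 or p > 100 for p in core_percentages):
        raise ValueError("percentages must lie between 0 and 100")
    involved = [p for p in core_percentages if p > 0]
    if not involved:
        raise ValueError("no involved cores")
    return mean(involved)


# Hypothetical biopsy: six cores taken, three of them involved.
cores = [0, 0, 20, 50, 80, 0]
n_involved = sum(1 for p in cores if p > 0)
print(n_involved, mean_core_involvement(cores))
```

Uninvolved cores are excluded from the denominator, which matches the definition given above (the mean is taken over involved cores, not over all cores taken).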
Methods
Clinical factors, radiotherapy parameters and histopathological grade were compared between groups with the χ2 test for dependent and independent comparisons. Long-term treatment outcome was assessed in terms of actuarial biochemical control (BC) and biochemical disease-free survival (bDFS). BC was defined as the absence of biochemical failure during follow-up according to the Phoenix definition (PSA increase of ≥ 2 ng/ml above the nadir). For bDFS, biochemical failure, clinical failure and patient death were counted as events. All clinical failures were preceded by biochemical failures. BC and bDFS were calculated with the Kaplan-Meier method, and differences between the groups were tested with the log-rank or χ2 test.
The role of selected factors in treatment outcome was assessed with uni- and multivariate Cox proportional hazards models. A p-value ≤ 0.05 was considered statistically significant.
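The Phoenix criterion can be expressed as a short check on a PSA series. The sketch below assumes that `psa_values` holds post-radiotherapy measurements in chronological order and that the nadir is the lowest value recorded before the rise; the function name and the sample series are hypothetical.

```python
from typing import Optional


def phoenix_failure_index(psa_values: list[float], threshold: float = 2.0) -> Optional[int]:
    """Index of the first PSA value meeting the Phoenix criterion, or None.

    A value qualifies when it is at least `threshold` ng/ml above the lowest
    PSA measured before it (the running nadir).
    """
    if not psa_values:
        return None
    nadir = psa_values[0]
    for i, psa in enumerate(psa_values[1:], start=1):
        if psa >= nadir + threshold:
            return i
        nadir = min(nadir, psa)
    return None


# Hypothetical PSA series (ng/ml) after radiotherapy.
print(phoenix_failure_index([1.2, 0.6, 0.4, 0.9, 2.5]))
print(phoenix_failure_index([1.0, 0.5, 0.7, 1.1]))
```

In the first series the nadir is 0.4 ng/ml, so the last value (2.5 ng/ml) is the first to exceed nadir + 2 ng/ml; the second series never meets the criterion and would count as biochemical control.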
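For readers less familiar with actuarial estimates, the Kaplan-Meier calculation named above can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical and the follow-up data are invented.

```python
def kaplan_meier(times: list[float], events: list[bool]) -> list[tuple[float, float]]:
    """Kaplan-Meier product-limit estimate.

    times  : follow-up time for each patient (e.g. months)
    events : True if the event occurred at that time, False if censored
    Returns (time, estimated event-free probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Handle all patients sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def survival_at(curve: list[tuple[float, float]], t: float) -> float:
    """Read the step function: estimate at time t (1.0 before the first event)."""
    estimate = 1.0
    for time, prob in curve:
        if time > t:
            break
        estimate = prob
    return estimate


# Invented follow-up data (months); False marks a censored observation.
curve = kaplan_meier([6, 12, 12, 30, 48, 60], [True, False, True, True, False, False])
print(curve)
print(survival_at(curve, 60))
```

Censored patients leave the risk set without lowering the curve, which is what distinguishes the actuarial estimate from a crude proportion of event-free patients.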
Results
Part 1
Median follow-up was 55 months. Neither of the analyzed end-points (BC, bDFS) differed significantly between GG2 and GG3.
In a comparative analysis, the 5-year BC was 84% in GG2 and 89% in GG3 (p = 0.51). Five-year bDFS was 77% and 75%, respectively (p = 0.86).
To validate this result, a multivariate analysis including other prognostic factors was undertaken. The following factors were included: age, T-stage, maximum pre-treatment PSA concentration, risk groups, hormonal treatment, and elective whole pelvic irradiation.
In a multivariate analysis, only the maximum pre-treatment PSA concentration (p = 0.019) and T-stage (p = 0.008) were significantly and independently associated with BC, whereas only the maximum pre-treatment PSA concentration was significant with regard to bDFS (p = 0.013). Grade Groups were of no significance with respect to these end-points.
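The two-group comparisons of BC and bDFS reported in this part rest on the log-rank test. A minimal sketch of that test is given below, assuming two groups and the standard hypergeometric variance; the function name and the toy data are hypothetical and unrelated to the study cohort.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi2, p_value) with 1 degree of freedom."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t, per group.
        n_a = sum(1 for ti, _, g in pooled if g == 0 and ti >= t)
        n_b = sum(1 for ti, _, g in pooled if g == 1 and ti >= t)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for ti, e, g in pooled if g == 0 and e and ti == t)
        d_b = sum(1 for ti, e, g in pooled if g == 1 and e and ti == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Toy data: every patient has an event; group A fails earlier than group B.
chi2, p = logrank_test([1, 2], [True, True], [3, 4], [True, True])
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

The test compares observed and expected event counts over the whole follow-up, so two groups with similar curves throughout, as for GG2 and GG3 here, yield a small statistic and a large p-value.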
Part 2
In the second part of the study, 139 pathology specimens were eligible for re-evaluation. The distribution of Grade Groups in the original and re-evaluation diagnoses is presented in Table II.
The concordance of Grade Groups was only 15%. Even when GG2 and GG3 were taken together as representing Gleason score 7, the majority of patients (96 patients, 69%) were up-graded to Gleason score 8 or higher. Down-grading was rare. Among all the re-evaluated specimens, perineural invasion was noted in 38 patients (27%) and a cribriform pattern in 42 patients (30%). The median number of involved cores was 3 (range 1-15), and the median percentage of involvement was 40% (range 4% to 100%).
All these factors were analyzed with regard to treatment outcome (BC or bDFS). The results are presented in Table III. Among the analyzed factors, only the Grade Group, and only when dichotomized into GG 1-3 vs. GG 4-5, was of prognostic significance (p = 0.012). The cribriform pattern was of borderline significance (p = 0.062). The results are presented in Figs. 1 and 2.
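Concordance and grade migration between two reports can be tabulated as sketched below. The function name `grade_migration` and the five paired grades are hypothetical and serve only to show the calculation; they are not taken from Table II. The sketch compares Grade Groups one by one, which is stricter than the pooled Gleason score 7 comparison described in this part.

```python
def grade_migration(original: list[int], reevaluated: list[int]) -> dict[str, float]:
    """Percentage of paired reports that are concordant, up-graded or down-graded."""
    if len(original) != len(reevaluated) or not original:
        raise ValueError("need two non-empty lists of equal length")
    n = len(original)
    same = sum(1 for o, r in zip(original, reevaluated) if r == o)
    up = sum(1 for o, r in zip(original, reevaluated) if r > o)
    down = n - same - up
    return {
        "concordant": 100 * same / n,
        "upgraded": 100 * up / n,
        "downgraded": 100 * down / n,
    }


# Hypothetical paired Grade Groups for five patients (original vs. re-evaluation).
print(grade_migration([2, 2, 3, 3, 2], [2, 4, 4, 5, 1]))
```

For orientation, the counts reported in this part correspond to 96/139 = 69% of patients up-graded beyond Gleason score 7, 38/139 = 27% with perineural invasion and 42/139 = 30% with a cribriform pattern.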
Discussion
The results of the first part of the study suggest no significant difference in treatment outcome between GG2 and GG3. This is in contrast to the observations that formed the background of the ISUP 2014 system [1]. However, several issues should be discussed at this point. The main study underlying the current ISUP grading system reported pronounced differences in a subgroup of 22 thousand patients treated with RP [1]. There was also a large group of 5 thousand patients treated with RT, for whom significant differences were also observed, but these were much less pronounced. This suggested to us the hypothesis that the lack of differences in our own material may be due to the lower reliability of biopsy findings, on which RT series are based. The rate of inaccurate grading of biopsy tissue with respect to the post-operative specimen exceeds 40% [8-10]. In intermediately differentiated cancer (GS7), a Gleason up-grade may be expected in around 30% of cases [9-11]. Additionally, it can be estimated that a change in clinical risk group implying a change in treatment decision would occur in more than 30% of patients [8]. This should warn us against uncritical belief in a biopsy report. On the other hand, the aforementioned study by Epstein et al. [1] also included an analysis of Grade Groups based on pre-prostatectomy biopsy, and the results were similar to those for post-prostatectomy specimens. Factors associated with RT may therefore also play a role. Unlike in many radiotherapy centers in North America or Europe, patients in Poland usually have hormonal treatment started by urologists before referral for RT. In our study, 88% of patients had been on hormonal treatment before RT, whereas in general practice, as reported in the literature, the rate was around 50-60% in the combined intermediate- and high-risk groups over a period similar to that of our study [12, 13, 14].
In the present study hormonal treatment was prescribed more frequently in GG3 than in GG2 (96% vs. 82%), and pelvic lymph nodes were also irradiated more frequently in GG3 (68% vs. 50%). It should be stressed that the combination of neoadjuvant hormonal therapy (HT) and whole pelvic RT significantly improved treatment outcome in the NRG/RTOG 9413 study [15]. On the other hand, in a study on stereotactic body radiotherapy (SBRT) in which neither hormonal treatment nor pelvic irradiation was allowed, considerable differences were observed between GG2 and GG3 [2].
The present study has some limitations. Firstly, a 5-year follow-up is too short to draw definitive conclusions in prostate cancer patients. Secondly, the retrospective character of the study carries the typical limitations. However, relatively uniform treatment performed in one center within a narrow time frame should at least partly reduce such limitations. Finally, the lack of central review of biopsy specimens may be a weak point in such an analysis. A second opinion is known to be associated with disagreement in the Gleason score in 15% to 38% of cases, not only in biopsy but also in post-prostatectomy specimens [16, 17, 18].
As the majority of studies showing pronounced differences between grade groups are based on contemporary pathologic evaluation and reporting, we considered this to be one of the most important factors influencing our results. Therefore, re-evaluation of the original biopsy specimens was planned.
The re-evaluation revealed considerable differences in grade groups compared with the original report. These differences exceeded our expectations. According to the literature we expected around 30%-50% discordance in the reported Gleason score, whereas in the present study almost 70% of patients were up-graded to a Gleason score higher than 7. Some uncertainty will probably remain unavoidable even in the contemporary era, as discordance in Gleason scoring between pathologists may reach 50% [19, 20]. Even within the same institution the discordance in Gleason scoring was 40% in one study [21]. However, concordance may be improved if pathologists are experienced in the field of urology [20].
In our opinion, the large grade migration between the original and re-evaluation reports observed in our study was related not only to the subjectivity of assessment but also to the changing rules of Gleason score evaluation over the last two decades. In many studies the second-opinion re-evaluation was done more or less at the same time as the original one, whereas in our study it was done around 10 years or more after the initial diagnosis. Some older reports in our study were probably not even in line with the 2005 ISUP rules, as we had observed in clinical practice. This explanation is supported by the general tendency to report a higher Gleason score nowadays than in the past [22]. This tendency should also be taken into consideration when comparing contemporary results with past series.
Last but not least, pathologists' experience increases with the number of evaluated specimens. Given the growth in the number of patients diagnosed with prostate cancer in Poland between 2005 and 2018, from 7000 to 16 500, it is evident that the experience of urologic pathologists has also increased. Hence, a pathologic re-evaluation performed by them now is more reliable than reports issued in past years and/or by less experienced pathologists.
Overall, only a small number of patients from the original GG2 and GG3 groups retained a diagnosis of GG2-3 after contemporary re-evaluation. Hence, we were not able to test the hypothesis of potential differences between these groups. Our final observation was that no recurrences were noted in GG 1-3, as opposed to 14% in GG 4-5. This confirms the worse prognosis in the latter groups, which reached significance even in such a small patient population. Given the very low number of patients in GG 1-3 and the relatively short follow-up for prostate cancer, we cannot conclude that these patients have cancer of low clinical significance. With a larger number of patients, some failures would probably occur.
We would also stress the importance of other factors from pathology reports, such as the cribriform pattern, which almost tripled the risk of biochemical recurrence in our study. Apart from the Grade Group, this was the only factor that approached significance in the present study, but the effect did not translate into bDFS. Factors other than the Gleason score, such as cribriform pattern and/or intraductal carcinoma or perineural invasion, may have an impact on treatment outcome [23, 24, 25]. According to contemporary rules, these should also be included in a pathology report [26].
To sum up, the differences between the original and re-evaluation pathology reports in our study are multifactorial. In our opinion, a “second opinion” by urologic pathologists is valuable, especially in borderline or questionable cases. In general practice a review may change the treatment decision in around 10% to 30% of cases [8, 27]. Furthermore, from a clinical point of view, it is important to also take into consideration other prognostic factors from the pathology report which are not included in the classical prognostic groups.
Conclusions
Re-evaluation and verification of pathology specimens according to contemporary guidelines up-graded the Gleason score in the majority of patients. Aggressive behavior of prostate cancer was observed from GG4 upward. A cribriform pattern almost tripled the biochemical failure rate.
The study was part of the MILESTONE project supported by the grant STRATEGMED2/267398/4/NCBR/2015.
The authors declare no conflict of interest.
References
1. Epstein JI, Egevad L, Amin MB, et al. The 2014 International Society of Urological Pathology (ISUP) Consensus Conference on Gleason Grading of Prostatic Carcinoma: definition of grading patterns and proposal for a new grading system. Am J Surg Pathol 2016; 40: 244-252.
2. Meier R, Kaplan I, Bloch D, et al. 10-year outcome of ultrahypofractionated stereotactic RT from two multicenter prostate cancer trials (abstract). Radiother Oncol 2021; 161 (Suppl 1): S391-S392.
3. Pompe RS, Davis-Bondarenko H, Zaffuto E, et al. Population-based validation of the 2014 ISUP Gleason grade groups in patients treated with radical prostatectomy, brachytherapy, external beam radiation or no local treatment. Prostate 2017; 77: 686-693.
4. Spratt DE, Cole AI, Palapattu GS, et al. Independent surgical validation of the new prostate grade-grouping system. BJU Int 2016; 118: 763-769.
5. Spratt DE, Jackson WC, Abugharib A, et al. Independent validation of the prognostic capacity of the ISUP prostate cancer grade grouping system for radiation treated patients with long-term follow-up. Prostate Cancer Prostatic Dis 2016; 19: 292-297.
6. Stock RG, Berkowitz S, Blacksburg SR, et al. Gleason 7 prostate cancer treated with low-dose-rate brachytherapy: lack of impact of primary Gleason pattern on biochemical failure. BJU Int 2012; 110: 1257-1261.
7. Toby J, Eade T, Hruby G, et al. Assessing the International Society of Urological Pathology (ISUP) prostate cancer grade groups in patients treated with definitive dose escalated external beam radiation. Radiother Oncol 2021; 162: 91-97.
8. Alshak MN, Patel N, Gross MD, et al. Persistent discordance in grade, stage and NCCN risk stratification in men undergoing targeted biopsy and radical prostatectomy. Urology 2020; 135: 117-123.
9. Cohen MS, Hanley RS, Kurteva T, et al. Comparing the Gleason prostate biopsy and Gleason postprostatectomy grading system: the Lahey Clinic Medical Center experience and an international meta-analysis. Eur Urol 2008; 54: 371-378.
10. Moussa AS, Li J, Soriano M, et al. Prostate biopsy clinical and pathological variables that predict significant grading changes in patients with intermediate and high grade prostate cancer. BJU Int 2009; 103: 43-48.
11. Keefe DT, Schieda N, El Hallani S, et al. Cribriform morphology predicts upstaging after radical prostatectomy in patients with Gleason score 3+4=7 prostate cancer at transrectal ultrasound (TRUS)-guided needle biopsy. Virchows Arch 2015; 467: 437-442.
12. Kok D, Gill S, Bressel M, et al. Late toxicity and biochemical control in 554 prostate cancer patients treated with and without dose escalated image guided radiotherapy. Radiother Oncol 2013; 107: 140-146.
13. Tamihardja J, Schortmann M, Lawrenz I, et al. Moderately hypofractionated radiotherapy for localized prostate cancer: updated long-term outcome and toxicity analysis. Strahlenther Onkol 2021; 197: 124-132.
14. Zumsteg ZS, Spratt DE, Romesser PB, et al. The natural history and predictors of outcome following biochemical relapse in the dose escalation era for prostate cancer patients undergoing definitive external beam radiotherapy. Eur Urol 2015; 67: 1009-1016.
15. Roach M, Moughan J, Lawton CAF, et al. Sequence of hormonal therapy and radiotherapy field size in unfavourable, localised prostate cancer (NRG/RTOG 9413): long-term results of a randomized phase 3 trial. Lancet Oncol 2018; 19: 1504-1515.
16. Brimo F, Schultz L, Epstein JI. The value of mandatory second opinion pathology review of prostate needle biopsy interpretation before radical prostatectomy. J Urol 2010; 184: 126-130.
17. Maehara T, Sadahira T, Maruyama Y, et al. A second opinion pathology review improves the diagnostic concordance between prostate cancer biopsy and radical prostatectomy specimens. Urol Ann 2021; 13: 119-124.
18. Netto GJ, Eisenberger M, Epstein JI. Interobserver variability in histologic evaluation of radical prostatectomy between central and local pathologists: findings of TAX 3501 multinational clinical trial. Urology 2011; 77: 1155-1160.
19. Ozkan TA, Eruyagar AT, Cebeci OO, et al. Interobserver variability in Gleason histological grading of prostate cancer. Scand J Urol 2016; 50: 420-424.
20. Oyama T, Allsbrook WC Jr, Kurokawa K, et al. A comparison of interobserver reproducibility of Gleason grading of prostatic carcinoma in Japan and the United States. Arch Pathol Lab Med 2005; 129: 1004-1010.
21. Coard KC, Freeman VL. Gleason grading of prostate cancer: level of concordance between pathologists at the University Hospital of the West Indies. Am J Clin Pathol 2004; 122: 373-376.
22. Boehm K, Borgmann H, Ebert T, et al. Stage and grade migration in prostate cancer treated with radical prostatectomy in a large German multicenter cohort. Clin Genitourin Cancer 2021; 19: 162-166.
23. Kweldam CF, Kummerlin IP, Nieboer D, et al. Presence of invasive cribriform or intraductal growth at biopsy outperforms percentage grade 4 in predicting outcome of Gleason score 3+4=7 prostate cancer. Mod Pathol 2017; 30: 1126-1132.
24. van der Slot MA, Hollemans E, den Bakker MA, et al. Inter-observer variability of cribriform architecture and percent Gleason pattern 4 in prostate cancer: relation to clinical outcome. Virchows Arch 2021; 478: 249-256.
25. Wu S, Xueming L, Lin SX, et al. Impact of biopsy perineural invasion on the outcomes of patients who underwent radical prostatectomy: a systematic review and meta-analysis. Scand J Urol 2019; 53: 287-294.
26. Kweldam CF, van Leenders GJ, van der Kwast T. Grading of prostate cancer: a work in progress. Histopathology 2019; 74: 146-160.
27. Nguyen PL, Schultz D, Renshaw AA, et al. The impact of pathology review on treatment recommendations for patients with adenocarcinoma of the prostate. Urol Oncol 2004; 22: 295-299.