Chronic Lymphocytic Leukemia
1 Department of Internal Medicine III, University of Ulm, Germany
2 Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
3 Department of Internal Medicine I, Medical University of Vienna, Vienna, Austria
4 Division of Molecular Genetics, DKFZ Heidelberg, Germany
5 Institute for Cancer Genetics, Department of Pathology and Genetics and Development, Columbia University, NY, USA
Correspondence: Stephan Stilgenbauer, Department of Internal Medicine III, University of Ulm, Albert-Einstein-Allee 23, 89081 Ulm, Germany. E-mail: stephan.stilgenbauer{at}uniklinikulm.de
Design and Methods: Expression markers were evaluated using real-time quantitative reverse transcriptase polymerase chain reaction in CD19+-purified samples from 151 patients. Multivariate analyses were performed to test the markers' ability to identify patients at genetic risk and to serve as prognostic markers in the context of established prognostic factors.
Results: For individual markers, ZAP70 expression provided the highest rate (81%) of correct assignment of patients at genetic risk (IGHV unmutated, V3-21 usage, 11q- or 17p-), followed by LPL and TCF7 (76% both). The assignment rate was improved to 88% by information from a four-gene combination (ZAP70, TCF7, DMD, ATM). In multivariate analysis of treatment-free survival, IGHV mutation status and expression of ADAM29 were of independent prognostic value besides disease stage. With regard to overall survival, expression of ATM, ADAM29, TCL1, and SEPT10 provided prognostic information in addition to that derived from clinical and genetic factors.
Conclusions: Gene expression markers are suitable for screening but not as surrogates for the information from genetic risk factors. While many individual markers may be associated with outcome, only a few are of independent prognostic significance. With regard to prognosis estimation, the genetic prognostic factors cannot be replaced by the expression markers.
Key words: chronic lymphocytic leukemia, markers, quantitative RT-PCR.
Based on these findings, gene expression parameters have been investigated for their association with genetic subgroups of CLL to reveal biological mechanisms and to identify potential surrogate markers for prognostic assessment. ZAP70, a ζ-chain-associated tyrosine kinase, has been broadly studied and was found to be a surrogate marker for unmutated IGHV status and for poor outcome.10–13 However, there is discordance between ZAP70 expression and IGHV status in about 10 to 25% of cases of CLL. Discordance rates appear to be higher in specific genetic subgroups such as those using V3-21, or with 11q- or 17p-.14,15 Additional potential surrogate markers for IGHV status have been suggested based on global gene expression studies.16–18 Among these, lipoprotein lipase (LPL) showed promising results with regard to estimation of the IGHV mutation status and survival in purified19 as well as in unpurified tumor samples.20,21 Furthermore, a number of other individual markers showed an association with genetic subgroups, clinical course, or the pathogenesis of the disease.22–29
However, systematic comparative analysis is lacking since most of the studies focused on single markers or were based on small and heterogeneous cohorts of patients with incomplete genetic profiles. The aim of the present study was, therefore, to investigate the value of a broad range of novel and established surrogate markers, namely ADAM29, ATM, CLLU1, DMD, GLO1, HCLS1, KIAA0977, LPL, MGC9913, PCDH9, PEG10, SEPT10, TCF7, TCL1, TP53, VIM, ZAP70, and ZNF2, for their ability to predict the genetic risk of patients (defined by IGHV status, V3-21 usage, 11q-, and 17p-) and survival in multivariate analyses including established prognostic factors.
Table 1. Patients' clinical and genetic characteristics divided into CD19+ purified and unpurified cases. Absolute numbers and % values are shown.
Real-time quantitative reverse transcriptase polymerase chain reaction
RNA was prepared and the RQ-PCR performed as reported elsewhere.15 DNase I digestion of total RNA was included to avoid contamination with genomic DNA. The TaqMan method (primers and probe) was used for quantification of all genes except for ADAM29, MGC9913, and PCDH9, for which the SYBR Green method was used. The primers for, and characteristics of, the candidate genes are listed in Online Supplementary Table S1. GAPDH was used as an endogenous control. Three peripheral blood samples from healthy donors were used, after CD19+ purification, for standardization.
To test for gene expression levels in non-B cells of peripheral blood, the results for CD19-negative fractions from four CLL patients and three healthy donors were compared to those of the respective positive fractions (Online Supplementary Table S1).
ZAP70 expression analysis by flow cytometry
For 72 cases included in the CD19+ cohort, ZAP70 expression was measured by four-color flow cytometry (CD5, CD19, CD3/56, ZAP70) according to Crespo et al.,11 as described previously.14 Positivity was defined as a level greater than 20%.11
Statistical analysis
For the attribution of genetic risk, a high risk group (including all patients with an unmutated IGHV status or V3-21 usage or 11q- or 17p-) and a low risk group (IGHV mutated without usage of V3-21 and no 11q- or 17p-) were defined. Prediction of IGHV mutation status and genetic risk group stratification were performed with binary and multinomial logistic regression analysis including the expression levels of all genes. To assess the prediction error of the resulting predictor, ten repetitions of 10-fold cross-validation were used. Multivariate Cox proportional hazards models including the expression levels of all genes, Binet stage, age, and the genetic risk groups (IGHV status, V3-21 usage, 11q-, 17p-) were used for the analysis of overall survival and treatment-free survival times. Backward selection using Akaike's information criterion (AIC) was applied to exclude redundant or unnecessary variables. For purposes of comparison, a model based on the gene expression factors alone was calculated. To evaluate the prediction accuracy of the two models (the full model including all variables and the model including gene expression factors only), prediction errors over time were calculated using the loss function approach described elsewhere.33,34 A measure of explained variation was derived by comparing the integrated prediction errors with the benchmark prediction error of survival prediction derived from Kaplan-Meier estimates. Kaplan-Meier estimates were used to compute marginal survival curves for censored data. Error estimation was done using ten repetitions of 10-fold cross-validation. An effect was considered statistically significant when (adjusted) P values were less than 5%. All statistical computations were performed with R, version 2.7.0, together with the R packages multtest, version 1.20.0, pec, version 1.0.7, nnet, version 7.2-41, and Design, version 2.1-1.35
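The explained variation mentioned above can be written compactly. The following is one common formulation, given here as an assumption since the exact definition used is only specified through the cited references; the symbols PE(t), IPE, and τ (the end of the evaluation period) are introduced for illustration only.

```latex
\mathrm{IPE} = \int_{0}^{\tau} \mathrm{PE}(t)\,dt,
\qquad
R^{2} = 1 - \frac{\mathrm{IPE}_{\text{model}}}{\mathrm{IPE}_{\text{Kaplan-Meier}}}
```

Under this formulation, a model whose prediction error curve lies below the Kaplan-Meier reference curve has a positive explained variation, and a model that predicts no better than the reference has a value of zero.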
Assignment of genetic risk
Assignment of IGHV mutation status was tested using logistic regression analysis. Best classifications among the candidate genes were obtained for LPL, ZAP70, and TCF7 (data not shown). Correct assignment of the IGHV status was achieved in 83% of all cases when using LPL or ZAP70 for classification, and in 75% when using TCF7 (83%, 82%, and 73%, respectively, when assessed by ten repetitions of 10-fold cross-validation).
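To illustrate how a cross-validated assignment rate of this kind can be obtained, the following minimal sketch runs repeated k-fold cross-validation for a single marker. It is not the analysis used in the study: a simple expression cutoff chosen on the training folds stands in for the logistic regression model, and the function names, the seed, and the rule that high expression means a positive label are all assumptions made for illustration.

```python
import random


def fit_cutoff(values, labels):
    """Pick the expression cutoff that maximizes training accuracy.

    Hypothetical helper: cases with expression above the cutoff are
    called positive (label 1). Candidate cutoffs are the midpoints
    between adjacent observed values, plus "all positive" and
    "all negative".
    """
    unique = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(unique, unique[1:])]
    candidates = [float("-inf")] + midpoints + [float("inf")]
    best_cut, best_correct = candidates[0], -1
    for cut in candidates:
        correct = sum((v > cut) == bool(y) for v, y in zip(values, labels))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut


def repeated_cv_accuracy(values, labels, k=10, repeats=10, seed=0):
    """Fraction of held-out cases classified correctly, pooled over
    `repeats` rounds of k-fold cross-validation."""
    rng = random.Random(seed)
    n = len(values)
    hits = 0
    tested = 0
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [order[i::k] for i in range(k)]
        for test_idx in folds:
            held_out = set(test_idx)
            train = [i for i in order if i not in held_out]
            cut = fit_cutoff([values[i] for i in train],
                             [labels[i] for i in train])
            hits += sum((values[i] > cut) == bool(labels[i]) for i in test_idx)
            tested += len(test_idx)
    return hits / tested
```

For a marker whose expression is lower in the positive group, the values would need to be negated before calling these functions. The cutoff is refitted within every training fold, so the held-out cases never influence the rule that classifies them.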
In addition to an unmutated IGHV status, V3-21 usage and deletions at 11q or 17p define poor risk subsets. Accordingly, genetic risk was defined by assigning all patients with an unmutated IGHV status, V3-21 usage, 11q- or 17p- to a high risk group, and patients with mutated IGHV without usage of V3-21, 11q- or 17p- to the low risk group (two-group risk model). According to this model, ZAP70 provided the highest rate of correct classifications (81% of all cases), followed by TCF7 and LPL (76% both) (Figure 1A–C) (81%, 76%, and 75%, respectively, after 10-fold cross-validation). Specifically, recognition of IGHV mutated V3-21-using cases as a poor risk subset contributed to the superiority of ZAP70 compared to LPL (Figure 1A–B): the majority of cases with V3-21 gene usage showed high levels of ZAP70 expression, including four of six V3-21-using cases with mutated IGHV. These cases were classified as false positive by ZAP70 with regard to the IGHV mutation status but were correctly classified as being at high clinical risk (due to V3-21 usage). In contrast, LPL expression levels correctly predicted IGHV mutation status in seven of the nine cases using V3-21 (mutated cases had predominantly low, and unmutated cases high, LPL expression) and accordingly did not identify V3-21 usage as a marker of high clinical risk.
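The two-group risk model described above amounts to a simple rule, sketched below. The `Case` record and its field names are hypothetical, and treating an IGHV homology of 98% or more as unmutated is an assumption based on the 98% cut-off shown in Figure 1.

```python
from dataclasses import dataclass

# Assumption: IGHV homology of 98% or more counts as unmutated,
# matching the 98% cut-off drawn in Figure 1.
IGHV_UNMUTATED_CUTOFF = 98.0


@dataclass
class Case:
    """Hypothetical per-patient record of the genetic factors."""
    ighv_homology: float  # IGHV homology in %
    v3_21: bool           # V3-21 gene usage
    del_11q: bool         # 11q- present
    del_17p: bool         # 17p- present


def genetic_risk(case: Case) -> str:
    """Return "high" or "low" under the two-group risk model."""
    unmutated = case.ighv_homology >= IGHV_UNMUTATED_CUTOFF
    if unmutated or case.v3_21 or case.del_11q or case.del_17p:
        return "high"
    return "low"
```

A case with mutated IGHV that uses V3-21, for example `Case(95.0, True, False, False)`, is assigned to the high risk group, which is the situation in which ZAP70 and LPL behaved differently.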
Figure 1. Patient distribution (n=151) according to marker expression and IGHV homology for ZAP70 (A, E), LPL (B, F), TCF7 (C, G), and the four-gene combination (D, H). Each circle represents one case. Y-axis: gene expression (logarithmic scale), X-axis: IGHV homology in %. Vertical line: 98% cut-off for the separation of the IGHV mutation subgroups. Horizontal line: gene expression cut-off for the separation of the high risk and low risk groups according to the logistic regression model prediction for the two-group risk model. A-D: all cases, V3-21 using cases indicated by filled circles. False allocations are given in % of the total cohort. E-H: all cases, 17p- and 11q- cases highlighted as shown.
The rates of discordance between ZAP70 expression levels and IGHV mutation status have been reported to be higher for patients with 11q- and 17p-.14 The relation between IGHV homology and gene expression within the 11q- and 17p- subgroups is detailed in Figure 1E–H for ZAP70, LPL, TCF7, and the four-gene combination according to the two-group risk model. The frequency of IGHV discordances within the 11q- and 17p- subsets was very low for ZAP70, LPL, and TCF7 (3 of 24 patients with 11q-, 1 of 18 patients with 17p-), and even lower for the four-gene combination (2 of 24 patients with 11q-, 0 of 18 patients with 17p-).
Classification of the IGHV mutation status by ZAP70: polymerase chain reaction versus flow cytometry
For 72 cases of the purified cohort, ZAP70 protein expression was determined by flow cytometry, as described previously.14 These results could be compared with transcript levels determined by RQ-PCR. Concordant results between flow cytometry and RQ-PCR were observed in 50 (69.4%) of the cases, while the results were discordant in 22 cases (30.6%). The discordant cases are listed in Table 2. Four of these cases showed high (i.e. false positive) mRNA levels by RQ-PCR in IGHV mutated cases including one case using the V3-21 gene. Another two were IGHV mutated cases with high (i.e. false positive) levels by flow cytometry. However, the majority of discordances (n=16) was related to IGHV unmutated cases with high ZAP70 levels by RQ-PCR but low levels (i.e. false negative) by flow cytometry.
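The counts reported above can be checked with a short tally. This is only an arithmetic sketch of the figures given in the text; the category labels are paraphrased for illustration.

```python
total_cases = 72
concordant = 50

# Discordant cases by category, as described in the text.
discordant = {
    "IGHV mutated, high by RQ-PCR only": 4,
    "IGHV mutated, high by flow cytometry only": 2,
    "IGHV unmutated, high by RQ-PCR but low by flow cytometry": 16,
}

n_discordant = sum(discordant.values())
# The categories must account for every non-concordant case.
assert concordant + n_discordant == total_cases

concordance_rate = 100 * concordant / total_cases
discordance_rate = 100 * n_discordant / total_cases
print(f"concordant: {concordance_rate:.1f}%")  # concordant: 69.4%
print(f"discordant: {discordance_rate:.1f}%")  # discordant: 30.6%
```

Sixteen of the 22 discordant cases fall into the last category, which is why the discordance is attributed mainly to false negative results by flow cytometry.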
Table 2. Characteristics of cases with discordance of ZAP70 expression as assessed by RQ-PCR and flow cytometry. Sorted by IGHV homology.
Prediction of treatment-free and overall survival
The prognostic value of the expression markers with regard to treatment-free survival and overall survival was studied using multivariate Cox regression analyses including the following variables: expression levels of all 14 candidate genes, clinical factors (age, stage), and the genetic factors (IGHV mutation status, V3-21 usage, 11q-, and 17p-) (full model; Table 3). The best estimation of treatment-free survival was achieved by a model including the variables ADAM29, IGHV mutation status, and Binet stage. Regarding the prediction of overall survival, a combined model consisting of clinical, genetic and gene expression markers was identified (Table 3). The most significant factors in this analysis were 17p-, Binet stage, and ATM expression. IGHV mutation status was not of significance in this model.
Table 3. Multivariate Cox regression analysis for treatment-free survival and overall survival including gene expression markers, clinical and genetic parameters (full model).
Table 4. Multivariate Cox regression analysis for treatment-free survival and overall survival with gene expression markers as the only included variables (restricted model, excluding clinical and genetic factors).
Figure 2. Prediction error curves for treatment-free survival (A) and overall survival (B) according to the restricted and full prognostic models. Restricted model (red): based on gene expression variables only; full model (green): including clinical and genetic factors; reference curve (gray): Kaplan-Meier estimation without additional variables. Curves below the reference curve indicate models with reduced prediction errors corresponding to higher explained variation values (% given in brackets for both models) and, therefore, higher prediction accuracy.
Classification of the IGHV mutation status among patients with 11q- or 17p- is of special interest, since increased misclassification of IGHV status was described within these subsets when using ZAP-70 determined by flow cytometry for classification.14 However, in the present study, misclassification of IGHV status occurred infrequently within these subsets when using gene expression levels of ZAP70, as well as of LPL or TCF7, for classification (<10% of patients with 11q- or 17p-). Importantly, misclassification of patients with 11q- or 17p- occurred very rarely when using the four-marker combination (11q-: 2 of 24, 17p-: 0 of 18) for IGHV assignment, underscoring the potential benefit of this classifier. However, although technically feasible, standardization of such a four-gene classifier would be challenging given the difficulties in standardization of the individual marker ZAP-70.37–39
Comparison of ZAP70 expression evaluated by FACS with gene expression levels revealed significant discordances between the results of the two methods (approximately 30%). The discordances were mainly IGHV unmutated cases being assigned false negative by FACS, pointing to a decreased sensitivity of the flow cytometric approach. This finding might reflect distinct biological properties of genetically high-risk CLL, such as post-translational down-regulation or enhanced protein degradation leading to reduced amounts of ZAP70 protein compared to mRNA. Alternatively, technical problems of flow cytometric detection might play a role such as usage of frozen samples or difficulties related to antibodies or procedures.14,37–39 The RQ-PCR-based approach might, therefore, offer a sensitive and reproducible alternative.40 However, the practicability of this approach is hampered by the need for cell purification prior to analysis.
Several of the investigated candidate genes have been proposed as novel prognostic factors in CLL.10–13,19–21,23,26,28 Since most of the markers were identified based on their association with IGHV status, correlation with survival in univariate analysis is not unexpected. In multivariate approaches, ZAP70 and LPL showed the potential to improve or substitute the information provided by IGHV status regarding treatment-free survival and overall survival.11,21,39,41 These studies were, however, restricted to a few selected markers and did not account for the prognostic impact of genomic abnormalities and V3-21 usage which were, in contrast, included in the present study. In this study, the information from the candidate genes was not able to replace that from IGHV status and disease stage with regard to predicting treatment-free survival, but expression of ADAM29 added independent prognostic information in the multivariate model. Therefore, ADAM29 expression may be used for refined prediction of disease progression. Interestingly, a model based on the expression of only three genes provided prognostic accuracy similar to that of the full model, thus offering a simplified tool for estimation of treatment-free survival.
The most validated established factors for prediction of overall survival are disease stage, IGHV mutation status, and 17p-. Multivariate analysis of overall survival including clinical and genetic factors resulted in a combined model predictor consisting of gene expression, clinical variables and genetic variables. This combined model was clearly superior to models based on gene expression factors alone or genetic factors alone (data not shown). Gene expression factors are, therefore, able to improve the estimation of overall survival provided by already established factors. The gene expression factors of additional impact were ATM, ADAM29, TCL1, and SEPT10. The quantitative relation between reduced levels of ATM expression and inferior overall survival is a novel finding and strongly suggests that this gene has a pathogenic role in CLL.
A close association between genomic loss at 11q22-q23 and reduced ATM transcript levels has been described18,29 and can be interpreted as a gene dosage loss, supporting the postulated concept of ATM having a tumor suppressor function in CLL.42–44 Selection of ATM expression instead of genomic deletion at 11q22-q23 in multivariate analysis indicates that quantitative transcript levels might reflect ATM dysfunction more precisely. TCL1 over-expression in mice resulted in a disease resembling CLL, suggesting that TCL1 might be directly involved in CLL transformation.24 The prognostic impact of TCL1 expression points to an ongoing pathogenic influence of TCL1 deregulation during disease progression. SEPT10 was significantly over-expressed in IGHV-unmutated CLL, confirming the findings of Bilban et al.45 Extending those findings, Benedetti et al.46 reported low SEPT10 expression in V3-21 CLL, comparable to that in IGHV-mutated CLL patients. It, therefore, appears that SEPT10 expression might be able to substitute for the survival information derived from IGHV mutation status but not for the information derived from V3-21 usage, as indicated by the multivariate overall survival model. LPL and ZAP70 were not among the selected parameters for the overall survival model in line with previous findings that ZAP70 lost prognostic significance in multivariate analysis when genetic factors were included.14 Before the novel classifier can be recommended for further application, independent validation is required, which should best be performed within prospective clinical trials. The disadvantages of the classifier are its complexity, requiring analysis of a multitude of factors, and the need for cell purification prior to RQ-PCR analyses.
In conclusion, the novel gene expression markers are not a satisfactory surrogate for genetic risk factors but may be used for screening of genetic risk, which is best achieved by a marker combination. With regard to estimation of prognosis, the gene expression factors cannot replace established prognostic factors. However, a limited set of gene expression markers was of independent prognostic value and thus improved the prediction of treatment-free survival and overall survival. The potential value of the markers for future risk stratification strategies will depend on the clinical situation, as illustrated by the differential impact on treatment-free survival and overall survival and the influence of co-variates such as disease stage. The potential pathogenic implications of some genes such as TCL1, ATM, and TCF7 warrant further functional investigation.
The online version of this article has a supplementary appendix.
DKi: designed and performed research, collected, analyzed and interpreted data, wrote the manuscript; AB: performed statistical analyses; CL, DW and CS: performed research; AB, TZ and AH: performed research, collected data; UJ, PL and RD-F: designed research; HD and SS: designed research, collected, analyzed and interpreted data, wrote the manuscript.
The authors reported no potential conflicts of interest.
Received for publication April 28, 2009. Revision received June 16, 2009. Accepted for publication July 6, 2009.