- Roel G.W. Verhaak1,2,
- Bas J. Wouters1,
- Claudia A.J. Erpelinck1,
- Saman Abbas1,
- H. Berna Beverloo3,
- Sanne Lugthart1,
- Bob Löwenberg1,
- Ruud Delwel1 and
- Peter J.M. Valk1
- 1 Department of Hematology, Erasmus University Medical Center, Rotterdam, The Netherlands
- 2 Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA; The Broad Institute of M.I.T. and Harvard, Cambridge, USA
- 3 Department of Clinical Genetics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Correspondence: Peter J.M. Valk, Erasmus University Medical Center Rotterdam, Department of Hematology, Ee1391a, Dr. Molewaterplein 50, 3015 GE Rotterdam Z-H, The Netherlands. E-mail:
We examined the gene expression profiles of two independent cohorts of patients with acute myeloid leukemia [n=247 and n=214 (younger than or equal to 60 years)] to study the applicability of gene expression profiling as a single assay in prediction of acute myeloid leukemia-specific molecular subtypes. The favorable cytogenetic acute myeloid leukemia subtypes, i.e., acute myeloid leukemia with t(8;21), t(15;17) or inv(16), were predicted with maximum accuracy (positive and negative predictive value: 100%). Mutations in NPM1 and CEBPA were predicted less accurately (positive predictive value: 66% and 100%, and negative predictive value: 99% and 97% respectively). Various other characteristic molecular acute myeloid leukemia subtypes, i.e., mutant FLT3 and RAS, abnormalities involving 11q23, −5/5q-, −7/7q-, abnormalities involving 3q (abn3q) and t(9;22), could not be correctly predicted using gene expression profiling. In conclusion, gene expression profiling allows accurate prediction of certain acute myeloid leukemia subtypes, e.g. those characterized by expression of chimeric transcription factors. However, detection of mutations affecting signaling molecules and numerical abnormalities still requires alternative molecular methods.
Acute myeloid leukemia (AML) is not a single disease but a group of neoplasms with various genetic abnormalities and variable responses to treatment. The pre-treatment karyotype is still essential in therapy decision-making in AML.1–3 In recent years, a number of novel molecular markers have been associated with prognosis in AML.2,3 Several attempts have been made to investigate whether genome-wide gene expression profiling (GEP) could be valuable for prediction of certain subtypes of AML.4–12 Although the predictive signatures were concordant across the various studies, none of those studies validated the derived signatures for predicting the recurrent molecular markers in independent representative AML cohorts. The question, therefore, remains whether GEP could substitute for current diagnostic techniques and be applied as a reliable single test to simultaneously detect known cytogenetic and molecular abnormalities. The aim of this study was to validate GEP as a preferred single assay to predict prognostically relevant AML subtypes using two large independent cohorts of young adults with AML.
Design and Methods
Bone marrow aspirates or peripheral blood samples from two independent representative cohorts of de novo AML patients (aged 60 years or younger), consisting of 247 and 214 patients, were collected (Table 1). The first cohort is a subset of the 285 patients previously studied,8 while the second cohort has not yet been described.
Blast cell purification and RNA isolation were carried out as previously described.8 All samples were analyzed using Affymetrix Human Genome U133Plus2.0 GeneChips (Affymetrix, Santa Clara, CA, USA). Labeling, hybridization, scanning and data normalization were performed as previously described.8 The variation between the scaling/normalization factors of the GeneChips in both cohorts was less than 3-fold [cohort1: 0.53(±0.15); cohort2: 0.73(±0.20)]. Also, the percentage of genes present [cohort1: 39.1(±3.1); cohort2: 40.6(±3.7)], GAPDH 3′/5′ ratio [cohort1: 1.07(±0.13); cohort2: 1.08(±0.16)] and actin 3′/5′ ratio [cohort1: 1.26(±0.21); cohort2: 1.33(±0.29)] were indicative of high overall quality and consistency between both AML sample populations. Mutational analyses to detect recurrent mutations in AML were performed as previously described.13–16 All supervised class prediction analyses were performed with Prediction Analysis for Microarrays (PAM) software version 1.28 in R version 184.108.40.206
Clinical, cytogenetic and molecular information as well as the gene expression profiles of all primary AML cases is available at the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo, accession number GSE6891).
Results and Discussion
In this study of 461 clinically and molecularly well-characterized cases of AML (Table 1), we were able to comprehensively validate the application of GEP to predict therapeutically relevant molecular subtypes in AML.
We applied PAM to investigate whether karyotypic and mutational abnormalities with prognostic or therapeutic value in AML were accurately predictable based on GEP. PAM allows the selection of the minimal number of genes required for optimal prediction, which may be beneficial in a diagnostic setting. The AML cohort1 (n=247) was used as training set to derive predictive signatures that were subsequently validated on AML cohort2 (n=214). The deduced expression signatures are available in the Online Supplementary Tables S1–18.
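The train-then-validate design described above can be illustrated with a simplified nearest-shrunken-centroid classifier, the general technique on which PAM is based. The sketch below is not the PAM software and is not the analysis performed in this study: the function names, the toy expression values, the labels and the hand-picked threshold `delta` are all hypothetical, and the sketch omits refinements such as class priors and selection of the threshold by cross-validation.

```python
import math

def soft_threshold(value, delta):
    """Shrink value toward zero by delta; values within delta of zero become 0."""
    magnitude = abs(value) - delta
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0

def fit_shrunken_centroids(samples, labels, delta):
    """Return per-class shrunken centroids, pooled per-gene SDs and the overall centroid."""
    classes = sorted(set(labels))
    n, n_genes = len(samples), len(samples[0])
    overall = [sum(s[g] for s in samples) / n for g in range(n_genes)]
    members = {c: [s for s, lab in zip(samples, labels) if lab == c] for c in classes}
    centroids = {c: [sum(s[g] for s in rows) / len(rows) for g in range(n_genes)]
                 for c, rows in members.items()}
    # Pooled within-class standard deviation of each gene.
    pooled_sd = []
    for g in range(n_genes):
        ss = sum((s[g] - centroids[lab][g]) ** 2 for s, lab in zip(samples, labels))
        pooled_sd.append(math.sqrt(ss / (n - len(classes))))
    shrunken = {}
    for c in classes:
        m_k = math.sqrt(1.0 / len(members[c]) - 1.0 / n)
        shrunken[c] = []
        for g in range(n_genes):
            scale = m_k * pooled_sd[g]
            d = (centroids[c][g] - overall[g]) / scale  # standardized class difference
            shrunken[c].append(overall[g] + scale * soft_threshold(d, delta))
    return shrunken, pooled_sd, overall

def active_genes(shrunken, overall):
    """Indices of genes whose shrunken centroid still differs from the overall centroid."""
    return [g for g in range(len(overall))
            if any(centroid[g] != overall[g] for centroid in shrunken.values())]

def predict(sample, shrunken, pooled_sd):
    """Assign the class whose shrunken centroid is closest in standardized distance."""
    def distance(c):
        return sum(((x - mu) / sd) ** 2
                   for x, mu, sd in zip(sample, shrunken[c], pooled_sd))
    return min(shrunken, key=distance)

# Hypothetical toy data: gene 0 separates the classes, gene 1 is noise.
train_x = [[5.0, 1.0], [6.0, 2.0], [5.5, 1.5], [1.0, 1.2], [2.0, 1.8], [1.5, 1.6]]
train_y = ["subtype", "subtype", "subtype", "other", "other", "other"]

shrunken, pooled_sd, overall = fit_shrunken_centroids(train_x, train_y, delta=2.0)
print("genes kept in the signature:", active_genes(shrunken, overall))
print("validation sample call:", predict([5.2, 1.4], shrunken, pooled_sd))
```

Raising `delta` removes more genes from the signature, which is how this family of methods arrives at a small set of genes for prediction; a sample from a separate validation set is then classified using only the centroids learned on the training set.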
The cytogenetic status of all AML patients with favorable risk, i.e. those with t(8;21), t(15;17) or inv(16) abnormalities, was predicted with 100% accuracy (Table 2). In fact, among these predicted AML cases, there were cases with favorable cytogenetics that had previously been missed by routine cytogenetics (4 out of 37 inv(16) and 4 out of 25 t(15;17)). The presence of the translocation-related fusion transcripts in these specific cases was confirmed by real-time quantitative PCR. Thus, GEP is a reliable alternative to discriminate these three AML subtypes,2,3 which represent approximately 20% of all cases.2,3 Prediction of t(15;17) and inv(16) required only a few genes, as seen previously.8 For the t(8;21) cases, 76 probe sets were needed to correctly classify all samples. However, as few as two probe sets, including one associated with the RUNX1T1 (ETO) gene, were sufficient to accurately classify all but one of the t(8;21) cases, which is also consistent with earlier studies8 (Online Supplementary Figure S3).
AML cases with mutations in the transcription factor CCAAT/enhancer binding protein α (CEBPA), which are associated with a relatively favorable treatment outcome, were predicted with positive and negative predictive values of 100% and 97% respectively. Six out of 15 CEBPA mutant cases were missed in the validation set (sensitivity 60%; Table 2). Of note, the misclassified cases all carried a single heterozygous CEBPA mutation, whereas samples with biallelic mutations (either homo- or heterozygous) were all correctly recognized (data not shown). In the training cohort, all but two (14/16) samples carried biallelic mutations14,18 and in cross-validation in the training cohort the two heterozygous mutants were the only misclassified samples as well.
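The CEBPA figures quoted above can be reproduced from a two-by-two table. The following is a minimal worked example, not the authors' code: the counts of 9 detected and 6 missed mutant cases and the absence of false positives follow from the text, whereas the count of 199 true negatives is an assumption that all 214 validation cases had a known CEBPA status.

```python
def predictive_values(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# 15 CEBPA mutant cases in the validation set, 6 of them missed -> 9 detected.
# A positive predictive value of 100% implies no false positives.
# tn = 214 - 15 = 199 is an ASSUMPTION (all validation cases typed for CEBPA).
cebpa = predictive_values(tp=9, fp=0, fn=6, tn=199)

for name, value in cebpa.items():
    print(f"{name}: {value:.1%}")
```

Under that assumption the negative predictive value is 199/205, which rounds to the reported 97%, and the sensitivity is 9/15, the reported 60%.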
Previous work has shown that mutations in nucleophosmin (NPM1) are strongly associated with a discriminative HOX- and TALE gene-specific signature.16,19 In this study, AML cases carrying an NPM1 mutation were indeed recognized with high accuracy based on such a signature (Table 2 and Online Supplementary Table S5). However, a relatively high number of AML cases without NPM1 mutations were incorrectly predicted positive (32 out of 151), suggesting the presence of genetic alterations resulting in a similar upregulation of the HOX- and TALE genes in those cases. Among these false positives were several AMLs carrying 11q23 abnormalities, which is in line with the role of the mixed lineage leukemia (MLL) protein as an important regulator of HOX gene expression.16 Of note, all t(6;9) AML cases in the training and validation cohort (n=6) were predicted to also carry an NPM1 mutation, raising the possibility that the DEK-CAN fusion protein also induces HOX-related gene expression. Interestingly, prediction of the t(6;9) translocation was partly feasible using a unique signature (Table 2 and Online Supplementary Table S14), although these results are based on a relatively low number of cases.
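To put the NPM1 false positives in proportion, the short calculation below converts the count given above (32 of 151 cases without an NPM1 mutation predicted positive) into a false positive rate and the corresponding specificity. The variable names are illustrative only, and no counts other than those two are taken from the text.

```python
false_positives = 32        # NPM1 wild-type cases predicted as mutant
npm1_wildtype_cases = 151   # all cases without an NPM1 mutation

false_positive_rate = false_positives / npm1_wildtype_cases
specificity = 1 - false_positive_rate

print(f"false positive rate: {false_positive_rate:.1%}")
print(f"specificity: {specificity:.1%}")
```

Roughly one in five wild-type cases is therefore called mutant, which explains why the positive predictive value for NPM1 is modest even though few mutant cases are missed.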
NPM1 mutations are associated with relatively favorable survival parameters in patients with a normal karyotype and standard risk AML.16,20–22 The favorable risk is particularly associated with AMLs lacking internal tandem duplications (ITD) in the fms-related tyrosine kinase 3 (FLT3) gene.16,20–22 Analyses of AML subsets defined by combined presence or absence of NPM1 and FLT3 ITD abnormalities demonstrated that only patients carrying both mutations could be moderately predicted, whereas the remaining subtypes could not be discriminated (Table 2). Restriction of these analyses to normal karyotype cases only did not result in a significant improvement in prediction accuracy (Online Supplementary Table S19). Of note, prediction of NPM1 mutation in preselected normal karyotype samples led to a slightly increased positive predictive value (83% vs. 66%), which may be consistent with the lack of interfering 11q23-positive samples.
The remaining cytogenetic and molecular subgroups we studied were not associated with strong predictive signatures. Whereas the positive predictive value for FLT3 ITD aberrations was relatively high (77%), the high number of false predictions eliminates GEP, with the currently available analysis tools, as a reliable test to determine FLT3 ITD status. Restriction to the normal karyotype group did not lead to a marked improvement (Online Supplementary Table S19). Likewise, the low positive predictive values for FLT3 tyrosine kinase domain (TKD) or RAS mutations, abnormalities involving 11q23, −5/5q-, −7/7q- and abn3q, and the translocation t(9;22), disqualify GEP as a single detection method for these abnormalities. Although 3q aberrations were thus not readily predictable, the most discriminative gene for abn3q was the oncogenic transcription factor ecotropic viral integration site 1 (EVI1) (Online Supplementary Table S15), which is frequently involved in 3q26 abnormalities. Of note, in these predictions we included the cases carrying a cryptic abn3q recently identified by gene expression analyses and fluorescence in situ hybridization.23
Classifiers were also deduced using a number of other approaches, i.e. compound covariate predictor, linear discriminant analysis, 1-nearest neighbor and 3-nearest neighbors, nearest centroid and support vector machines (probe set selection at 0.001 significance level). These alternative analyses were carried out in BRB-ArrayTools, version 3.7.0 β2 release, developed by Dr. Richard Simon and Amy Peng Lam. Overall, this comparative analysis yielded highly similar results, i.e. the favorable cytogenetic subclasses were predictable with (close to) 100% accuracy, whereas other subtypes showed a prediction pattern similar to that depicted in Table 2 (data not shown). One exception was NPM1 mutation status, for which prediction accuracy was better using an approach based on support vector machines (positive predictive value 91% with a negative predictive value of 99%).
Several general causes for the inability to predict specific recurrent abnormalities could apply: (i) if different recurrent genetic aberrations affect similar pathways, their GEP signatures may overlap; (ii) mutations affecting signaling pathways may not result in strong discriminative mRNA expression signatures; (iii) the expression of differentiation-related genes may affect accurate prediction; (iv) secondary mutations, or biallelic versus monoallelic mutations as in the case of CEBPA, may prohibit reliable prediction. More specifically, (v) the various partners of the MLL gene may affect reliable prediction of 11q23 abnormalities, and (vi) the numerical changes in (part of) chromosomes 5 and 7 may only result in minor changes in gene expression that are insufficient for GEP prediction. Of note, almost all discriminative genes with decreased expression in the deduced signature for 7(q) abnormalities were nevertheless located on chromosome 7, including FASTK, GSTK1, LSM8 and ZNF746 (Online Supplementary Table S17).
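Of the alternative classifiers listed above, the 1-nearest neighbor and 3-nearest neighbors rules are the simplest to illustrate. The sketch below is a generic k-nearest-neighbor vote on hypothetical toy data with Euclidean distance; it is not the BRB-ArrayTools implementation, and the function name, values and labels are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(sample, train_x, train_y, k):
    """Majority vote among the k training samples closest to `sample`."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(sample, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-gene expression values; the last training sample is an
# atypical "subtype" case lying among the "other" cases.
knn_train_x = [[5.0, 1.0], [6.0, 2.0], [5.5, 1.5],
               [1.0, 1.2], [2.0, 1.8], [1.5, 1.6],
               [1.4, 1.5]]
knn_train_y = ["subtype", "subtype", "subtype",
               "other", "other", "other",
               "subtype"]

query = [1.3, 1.5]
call_1nn = knn_predict(query, knn_train_x, knn_train_y, k=1)
call_3nn = knn_predict(query, knn_train_x, knn_train_y, k=3)
print("1-nearest neighbor call:", call_1nn)
print("3-nearest neighbors call:", call_3nn)
```

In this toy setting the single atypical training case decides the 1-nearest neighbor call, whereas the 3-nearest neighbors vote is dominated by the surrounding cases, which is one reason why several classifiers are usually compared on the same data.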
Altogether, we conclude that AML cases with favorable cytogenetics are predictable with high accuracy with the currently available genome-wide gene expression technology and analysis tools. All other known prognostically and therapeutically relevant abnormalities in AML still require additional molecular methods for detection.
RGWV and BJW contributed equally to this manuscript. Funding: this work was supported by grants from the Dutch Cancer Society (Koningin Wilhelmina Fonds) and the Erasmus University Medical Center (Revolving Fund). We are indebted to Gert J. Ossenkoppele, M.D. (Free University Medical Center, Amsterdam, The Netherlands), Jaap Jan Zwaginga, M.D. (Sanquin, The Netherlands), Edo Vellenga, M.D. (University Hospital, Groningen, The Netherlands), Leo F. Verdonck, M.D. (University Hospital, Utrecht, The Netherlands), Gregor Verhoef, M.D. (Hospital Gasthuisberg, Leuven, Belgium) and Matthias Theobald, M.D. (Johannes Gutenberg-University Hospital, Mainz, Germany), who provided us with AML samples, and to our colleagues from the bone marrow transplantation group and molecular diagnostics laboratory for storage of the samples and molecular analyses, respectively.
The online version of this article contains a supplementary appendix.
Authorship and Disclosures
RGWV: performed research, analyses and wrote manuscript; BJW: performed research, analyses and wrote manuscript; CAJE: performed research; SA: performed research; HBB: performed research; SL: performed research; BL: designed research and wrote manuscript; RD: designed research and wrote manuscript; PJMV: performed research and analyses, designed research and wrote manuscript. The authors reported no potential conflicts of interest.
- Received May 2, 2008.
- Revision received July 1, 2008.
- Accepted August 7, 2008.
- Copyright © 2009 Ferrata Storti Foundation