- Melchior Lauten1,
- Anja Möricke2,
- Rita Beier3,
- Martin Zimmermann3,
- Martin Stanulla2,
- Barbara Meissner2,
- Edelgard Odenwald3,
- Andishe Attarbaschi4,
- Charlotte Niemeyer5,
- Felix Niggli6,
- Hansjörg Riehm3 and
- Martin Schrappe2
- 1Pediatric Hematology and Oncology, University Hospital Schleswig-Holstein, Lübeck Campus, Germany
- 2Department of Pediatrics, University Hospital Schleswig-Holstein, Kiel Campus, Germany
- 3Pediatric Hematology and Oncology, Hannover Medical School, Hannover, Germany
- 4St. Anna Children’s Hospital, Vienna, Austria
- 5Pediatric Hematology and Oncology, University Hospital Freiburg, Germany
- 6Kinderspital, University Hospital Zürich, Switzerland
- Correspondence: Anja Möricke, MD, University Hospital Schleswig-Holstein, Campus Kiel, Department of Pediatrics, Schwanenweg 20, 24105 Kiel, Germany. Phone: international +49.431.5974028. Fax: international +49.431.5974034. E-mail:
Background In the ALL-BFM 95 trial for treatment of acute lymphoblastic leukemia, response to a prednisone pre-phase (prednisone response) was used for risk stratification in combination with age and white blood cell count at diagnosis, response to induction therapy and specific genetic high-risk features.
Design and Methods Cytomorphological marrow response was prospectively assessed on Day 15 during induction, and its prognostic value was analyzed in 1,431 patients treated on ALL-BFM 95.
Results The 8-year probabilities of event-free survival were 86.1%, 74.5%, and 46.4% for patients with M1, M2, and M3 Day 15 marrows, respectively. Compared to prednisone response, Day 15 marrow response was superior in outcome prediction in precursor B-cell and T-cell leukemia with, however, a differential effect depending on blast lineage. Outcome was poor in T-cell leukemia patients with prednisone poor-response independent of Day 15 marrow response, whereas among patients with prednisone good-response different risk groups could be identified by Day 15 marrow response. In contrast, prednisone response lost prognostic significance in precursor B-cell leukemia when stratified by Day 15 marrow response. Age and white blood cell count retained their independent prognostic effect.
Conclusions Selective addition of Day 15 marrow response to conventional stratification criteria applied on ALL-BFM 95 (currently in use in several countries as regular chemotherapy protocol for childhood acute lymphoblastic leukemia) may significantly improve risk-adapted treatment delivery. Even though cutting-edge trial risk stratification is meanwhile dominated by minimal residual disease evaluation, an improved conventional risk assessment, as presented here, could be of great importance to countries that lack the technical and/or financial resources associated with the application of minimal residual disease analysis.
Early reduction of malignant cell load is known to be of major importance for the prediction of treatment outcome in solid and hematologic tumors.1–4 In 1983, the Berlin-Frankfurt-Münster (BFM) study group started to evaluate the early treatment response to a prednisone pre-phase (prednisone response, PR) as a predictive factor for treatment outcome by measuring the peripheral blast count on Day 8 of treatment.5 Since then, the PR has consistently been found to be one of the strongest independent prognostic factors for the prediction of treatment outcome in ALL-BFM studies.6
In the 1970s, the Children’s Cancer Study Group (CCG) started to evaluate early bone marrow response during multi-agent induction treatment and demonstrated the predictive value of early marrow response in terms of remission achievement and ultimate outcome.7–9 In the following trials, the CCG generated many data on the prognostic importance of marrow response on Days 7 and 14, the combined impact of the two evaluation points, and the differential effect in patients at standard or high risk according to the NCI/Rome criteria.10–13 Based on these results, early marrow response has become an integral part of risk stratification in the successive CCG and contemporary Children’s Oncology Group (COG) ALL treatment regimens.14–19 The St. Jude Total Therapy Study Group showed that even the persistence of low percentages (1–4%) of BM lymphoblasts on Day 15 (corresponding to Day 22 of the BFM protocol without prednisone prephase) and Days 22 to 25 of induction therapy was associated with a significantly poorer 5-year event free survival rate compared to patients without detectable BM blasts (40±6% vs. 78±2%).20
In the ALL-BFM 95 trial, cytomorphological response in BM on Day 15 (BMd15) was prospectively assessed without being used for risk stratification. In the present study, the prognostic value of BMd15 in ALL-BFM 95 was evaluated in comparison and in combination with PR, cytomorphological BM response to induction therapy (Day 33), age and white blood cell count (WBC) at diagnosis, all of which were included in the ALL-BFM 95 risk stratification. The aim was to refine the risk criteria used in ALL-BFM 95 without using modern minimal residual disease (MRD) techniques, which might not be available in less affluent countries because of cost.
Design and Methods
Of the 2,169 patients eligible for the ALL-BFM 95 study, 1,431 had assessable information on BM morphology on Day 15. These patients were included in the current study. Informed consent was obtained from the parents or guardians of each patient. Data were managed in the ALL-BFM study center. The trial was approved by the ethics committee of the Hannover Medical School, Germany. Treatment regimen and outcome of the ALL-BFM 95 trial have been recently described.21
Response and relapse criteria
PR was defined by the absolute number of leukemic blasts/μL in the peripheral blood after seven days of prednisone treatment and one intrathecal (IT) dose of methotrexate, regardless of the initial leukemic blast count. Prednisone good responders (PGR) were characterized by less than 1,000 blasts/μL, whereas prednisone poor responders (PPR) showed 1,000 blasts/μL or more on Day 8 of treatment.5 Response in BM was evaluated on Days 15 and 33 of induction treatment and was categorized as M1 (<5%), M2 (5 to <25%), and M3 (≥25% lymphoblasts). Complete remission (CR) was defined as M1 BM on Day 33 of induction therapy, the absence of leukemic blasts in blood and CSF, and no evidence of local disease. Relapse was defined as recurrence of 25% lymphoblasts or over in BM or local leukemic infiltrates at any site. Both PR and BM evaluation were reviewed centrally in two reference laboratories.
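The response definitions above are threshold rules, so they can be sketched in a few lines of Python. This is an illustrative sketch only, not software used in the trial; the function names `classify_prednisone_response` and `classify_marrow` are hypothetical.

```python
def classify_prednisone_response(blasts_per_ul: float) -> str:
    """Day 8 peripheral blood: PGR if fewer than 1,000 blasts/uL, otherwise PPR."""
    if blasts_per_ul < 0:
        raise ValueError("blast count cannot be negative")
    return "PGR" if blasts_per_ul < 1000 else "PPR"


def classify_marrow(blast_percent: float) -> str:
    """Bone marrow category: M1 (<5%), M2 (5 to <25%), M3 (>=25% lymphoblasts)."""
    if not 0 <= blast_percent <= 100:
        raise ValueError("blast percentage must be between 0 and 100")
    if blast_percent < 5:
        return "M1"
    if blast_percent < 25:
        return "M2"
    return "M3"
```

Note that both cut-offs are inclusive on the unfavorable side: exactly 1,000 blasts/μL counts as PPR, and exactly 25% blasts counts as M3.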
Patients were stratified into three risk groups according to the following criteria:
HR: PPR, and/or no CR on Day 33, and/or evidence of t(9;22) (or BCR/ABL), and/or evidence of t(4;11) (or MLL/AF4);
MR: No HR criteria, and initial WBC ≥20×10⁹/L and/or age at diagnosis <1 or ≥6 years, and/or T-ALL;
SR: No HR criteria, and initial WBC <20×10⁹/L and age at diagnosis ≥1 and <6 years, and no T-ALL.
CNS status was not a stratification criterion.
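The three risk-group definitions can be read as a decision rule in which high-risk criteria are checked first and the standard-risk group is what remains. The following Python sketch encodes that reading; the `Patient` class and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names mirroring the stratification criteria in the text.
    prednisone_poor_response: bool  # PPR on Day 8
    cr_day33: bool                  # complete remission on Day 33
    bcr_abl: bool                   # t(9;22) or BCR/ABL
    mll_af4: bool                   # t(4;11) or MLL/AF4
    wbc_e9_per_l: float             # initial WBC in units of 10^9/L
    age_years: float                # age at diagnosis
    t_all: bool                     # T-cell immunophenotype


def risk_group(p: Patient) -> str:
    """Return 'HR', 'MR' or 'SR' following the criteria listed above."""
    # Any single high-risk criterion is sufficient for HR.
    if p.prednisone_poor_response or not p.cr_day33 or p.bcr_abl or p.mll_af4:
        return "HR"
    # Without HR criteria, any one of these features moves the patient to MR.
    if p.wbc_e9_per_l >= 20 or p.age_years < 1 or p.age_years >= 6 or p.t_all:
        return "MR"
    # Remaining patients: WBC < 20, age >= 1 and < 6 years, no T-ALL.
    return "SR"
```

For example, `risk_group(Patient(False, True, False, False, 8.0, 3.0, False))` returns `"SR"`, whereas the same patient with T-ALL would be assigned to `"MR"`.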
Event-free survival was defined as the time from diagnosis to the date of last follow up in complete remission or first event. Events were resistance to therapy (non-response), relapse, secondary malignant neoplasm (SMN) or death from any cause. Failure to achieve remission due to early death or non-response was considered an event at time zero. Patients lost to follow up were censored at the time of their withdrawal. The Kaplan-Meier method22 was used to estimate survival rates; differences were compared with the two-sided log-rank test.23 Differences in the distribution of individual parameters among patient subsets were analyzed using the χ² test for categorized variables. All P values were two-sided and P<0.05 was considered statistically significant. Cox’s proportional hazards model was used to obtain the estimates and the 95% confidence interval of the relative risk for prognostic factors.24 The results of the ALL-BFM 95 trial were updated in August 2008.
Statistical analyses were performed using SAS (SAS-PC, version 9.1; SAS Institute Inc., Cary, NC, USA) and IBM SPSS Statistics (version 15).
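To make the event-free survival definition concrete, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit estimator. It is not the SAS or SPSS procedure used for the analyses; the function name `kaplan_meier` and the toy data in the usage note are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] for each distinct time at which an event occurred.

    times  : follow-up time of each patient (time zero for induction failure)
    events : True if an event occurred at that time, False if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    events_at = Counter(t for t, e in zip(times, events) if e)
    leaving_at = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = events_at.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving_at[t]
    return curve
```

With the toy input `kaplan_meier([0, 2, 3, 3, 5, 8], [True, True, False, True, False, False])`, the first patient is an event at time zero, as the definition requires for early death or non-response, and the curve steps down at times 0, 2 and 3 only; censored patients reduce the number at risk without lowering the estimate.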
BM puncture on Day 15 of induction therapy Protocol I was performed in 1,696 (78%) of the 2,169 patients of the ALL-BFM 95 trial. In 1,431 (84%) of the 1,696 BM punctures the BM smears were eligible for evaluation and could be included in the present study. Characteristics of these patients and of the patients who could not be included due to missing BMd15 data are shown in the Online Supplementary Table S1. Patients who were not included due to non-representative BMd15 had a lower rate of PPR, presented less often with hyperleukocytosis, were less often BCR/ABL positive and included a lower rate of high-risk patients compared to those patients included in the study. However, the rate of complete remission on Day 33 was higher in patients not included in the study (Online Supplementary Table S1).
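The cohort proportions quoted above follow directly from the three patient counts. The short Python check below recomputes them; the variable names are hypothetical and the counts are those reported in the text.

```python
ELIGIBLE = 2169          # patients eligible for ALL-BFM 95
PUNCTURED_DAY15 = 1696   # patients with a Day 15 BM puncture
EVALUABLE_DAY15 = 1431   # punctures with evaluable smears


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


punctured_rate = percent(PUNCTURED_DAY15, ELIGIBLE)
evaluable_rate = percent(EVALUABLE_DAY15, PUNCTURED_DAY15)
not_evaluable_rate = percent(PUNCTURED_DAY15 - EVALUABLE_DAY15, PUNCTURED_DAY15)

print(f"punctured: {punctured_rate:.1f}%")
print(f"evaluable: {evaluable_rate:.1f}%")
print(f"not evaluable: {not_evaluable_rate:.1f}%")
```

The complement of the evaluable fraction, 265 of 1,696 punctures, corresponds to the share of Day 15 aspirates that could not be assessed.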
The estimated probability of 8-year EFS (8y-pEFS) of all patients included in this study was 78.8±0.9%.
PR was evaluable in 1,419 of the 1,431 (99%) patients analyzed; 1,280 (90%) patients showed PGR, 139 (10%) patients were defined as PPR. The 8y-pEFS was 81.3±0.9% for patients with PGR and 55.1±3.7% for patients with PPR (P<0.01). BMd15 characterized three distinct risk groups. The 8y-pEFS of these groups was 86±1%, 74±2%, and 46±4% for the patients with M1, M2 and M3 marrow, respectively (Table 1). BM response on Day 33 (BMd33) could be assessed in 1,415 of 1,431 patients. Only 42 of these patients did not achieve BM remission on Day 33 (NRd33) and had an 8y-pEFS of 36.3±6.9%. Of these patients, 38 (90%) had an M3 and 4 patients an M2 BMd15. Among all patients with M3 BMd15, 8y-pEFS was 52.5±4.2% (n=146) for those patients who achieved complete cytomorphological remission (CR) by Day 33 and 25.4±7.2% (n=38) for those who did not (P<0.001).
The results of the BMd15 subgroups according to various patients’ characteristics are presented in Table 1. Poor response in BMd15 was significantly associated with T-ALL (P<0.001) and with the known high-risk features adolescent age (P<0.001), hyperleukocytosis (P<0.001), BCR/ABL (P=0.003), CNS involvement (P=0.005), PPR (P<0.001), and NRd33 (P<0.001). A significant prediction of prognosis by BMd15 could be seen for all subgroups analyzed except for the small subgroup of patients with initial CNS involvement.
The cut-off values characterizing M1, M2 and M3 for the distinction of BMd15 subgroups are internationally recognized. However, each subgroup includes patients with a wide range of BM blasts. Therefore, in addition to the traditional M1, M2 and M3 categories, we analyzed patients within narrower ranges of blasts. Results are shown in Figure 1 and indicate that a steady increase in BMd15 blasts is paralleled by a steady decrease in 8y-pEFS. Interestingly, there is a clear distinction in pEFS between 0% and more than 0% to less than 5% BM blasts (within the M1 category), and between 25% to less than 50% and 50% or more BM blasts on Day 15 (within the M3 category) (Figure 1). Of those patients with 75% blasts or more on Day 15, 55% nonetheless reached CR on Day 33 and had an 8y-pEFS of 42.9±12.6%. Among those patients with 25% to less than 50% blasts on Day 15, 92% reached CR on Day 33 with an 8y-pEFS of 57.5±5.5%.
Detailed analyses revealed marked differences within immunophenotypic subgroups. Results are, therefore, shown for each subgroup separately.
In pB-ALL (n=1,196; 1,187 patients with evaluable PR), patients with PGR had an 8y-pEFS of 80.2±1.2% (n=1,109) and PPR patients an 8y-pEFS of 58.4±5.6% (n=78) (P<0.001). Patients in the BMd15 subgroups had an 8y-pEFS of 86.5±1.3% (n=741), 74.2±2.5% (n=317), and 47.1±4.3% (n=138) for the M1, M2 and M3 groups, respectively. Though a smaller group of pB-ALL patients could be identified by PPR compared to M3 BMd15 (6.6% vs. 11.5%), the EFS of the patients in the M3 BMd15 group was even worse than the EFS of PPR patients, showing the better prognostic discriminative value of M3 BMd15. This was also reflected in the distribution of events: 12.7% of all events in pB-ALL (n=32 of 256) clustered in the PPR group, whereas 28.5% were detected in the BMd15 M3 group (n=72 of 256) (Figure 2A and B). Sensitivity of PR to predict poor BM response on Day 15 or Day 33 was low as only 27.9% of patients with M3 BMd15 and 56.7% of patients with NRd33 had shown PPR before. BMd15 allowed a clear separation of three different risk groups for patients with M1, M2 and M3 marrow within the subgroups of PGR and PPR patients (Figure 2A and B). There was no statistical difference in pEFS between patients in the same BMd15 subgroup when analyzed according to PR (Table 2).
Age and WBC as well as NCI risk criteria25,26 and risk group criteria of the ALL-BFM 95 trial (both using age at diagnosis and initial WBC) showed an additional prognostic value when analyzed in combination with BMd15 (Table 2).
Univariate results were confirmed by a multivariate Cox’s regression analysis including NCI risk criteria, PR, BMd15 and BMd33 as covariates. In this analysis, PR lost its prognostic significance whereas the NCI risk criteria, as well as BM response on Days 15 and 33, retained significance (Table 3).
In T-ALL (n=194; 191 patients with evaluable PR), PGR patients had an 8y-pEFS of 84.6±3.3% (n=130) and patients with PPR had an 8y-pEFS of 54.1±6.4% (n=61). The 8y-pEFS of patients with M1, M2 and M3 BMd15 was 86.9±3.3% (n=107), 78.8±6.7% (n=40), and 45.0±7.7% (n=47), respectively. Sensitivity of PR to predict poor BM response on Day 15 or Day 33 was better in T-ALL than in pB-ALL: 72.3% of the patients with M3 BMd15 and 81.8% of the patients with NRd33 had shown PPR before.
Among the patients with PPR, BMd15 was not able to characterize subgroups with significantly different outcomes (Figure 2D). In PGR, however, outcome of patients with BMd15 M3 was significantly worse (M3, 8y-pEFS 43.1±14.7%) than the M1 and M2 subgroup with similarly favorable results (M1: 8y-pEFS 91.1±3.0%; M2: 8y-pEFS 83.4±7.7%) (Figure 2C). The prognostic relevance of the PR within the BMd15 subgroups in T-ALL is illustrated by the reverse analysis in Table 2. Within the BMd15 M1 subgroup, patients with significantly worse pEFS could be identified through PPR whereas no difference in outcome was shown within the BMd15 M3 subgroup. Within the small BMd15 M2 subgroup, the difference between PGR and PPR did not reach statistical significance. Thus, by combining PR and BMd15, T-ALL patients can be stratified into two distinct risk groups: one including the patients with PGR plus M1 or M2 BMd15 (n=120, 8y-pEFS 89.5±2.9%) the other including all patients with PPR and/or M3 BMd15 (n=74, 8y-pEFS 52.1±5.9%) (P<0.001) (Figure 3).
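The two-group T-ALL stratification described above can be written as a single rule. The sketch below is illustrative only; the function name and the labels `"favorable"` and `"unfavorable"` are hypothetical and are not terms used in the trial.

```python
def t_all_combined_risk(pr: str, bmd15: str) -> str:
    """Combine prednisone response and Day 15 marrow for T-ALL patients.

    pr    : 'PGR' or 'PPR'
    bmd15 : 'M1', 'M2' or 'M3'
    """
    if pr not in {"PGR", "PPR"}:
        raise ValueError("pr must be 'PGR' or 'PPR'")
    if bmd15 not in {"M1", "M2", "M3"}:
        raise ValueError("bmd15 must be 'M1', 'M2' or 'M3'")
    # PPR and/or M3 marrow on Day 15 defines the poor-outcome group.
    if pr == "PPR" or bmd15 == "M3":
        return "unfavorable"
    # Remaining patients: PGR with M1 or M2 marrow on Day 15.
    return "favorable"
```

Under this rule a prednisone good responder with M3 marrow and a prednisone poor responder with M1 marrow both fall into the same group, which reflects the finding that either feature alone was associated with poor outcome in T-ALL.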
NCI risk criteria had a borderline significant prognostic value in patients with M1 BMd15 but showed no statistical significance in patients with M2 or M3 BMd15 (Table 2).
Consistent with these results, multivariate Cox’s regression analysis including NCI risk criteria, PR, BMd15 and BMd33 as covariates revealed BMd15 M3 as the strongest independent adverse risk factor, and also marginal significance for PPR and NCI-HR (Table 3).
For more than 20 years, cytomorphological response has been the leading criterion for stratifying patients into risk groups within the ALL-BFM trials. Since the ALL-BFM 86 trial, cytomorphological response has been estimated very early during induction treatment using the PR as criterion for risk stratification. Cytomorphological treatment response in the BM, however, was evaluated only at the end of induction treatment (Day 33). Poor cytomorphological response at either response evaluation point qualified a patient for high-risk treatment.6,27,28
The prognostic significance of early reduction of leukemic blasts in BM at different time points during induction treatment was shown in a number of pediatric ALL trials10 and was implemented as risk stratification criterion in various international trials.12,13,29–34 Specificity of response evaluation might, nevertheless, vary depending on the time of response evaluation with regard to the therapy and the composition of the treatment.13,35,36
With the aim of prospectively assessing the prognostic value of an early cytomorphological response evaluation in the BM, a BM puncture on Day 15 of induction treatment was performed in addition to the evaluation of PR and BMd33 in ALL-BFM 95.21 However, whereas the very easy sampling and evaluation of the peripheral blood samples on Day 8 provided assessable PR samples for nearly all patients, 15.6% of the BM aspirates on Day 15 could not be assessed due to non-representative BM morphology.
Overall, the prediction of treatment outcome was possible with each of the three response parameters PR, BMd15 or BMd33. BMd15 allowed a better prediction of outcome than PR in pB-ALL as well as T-ALL but the additional prognostic value of PR depended on the immunophenotype. In pB-ALL, BMd15 could identify three distinct risk groups, and the PR had no significant additional effect in patients stratified by BMd15. Biologically, this seems highly plausible considering the fact that the PR is measured after the administration of seven days of prednisone and one IT dose of MTX, while the evaluation of the BM on Day 15 reflects the response to 14 days of prednisone, one dose each of vincristine, daunorubicin and asparaginase, and two doses of IT MTX. This might also indicate that, in pB-ALL, resistance to prednisone can be compensated by high sensitivity to other chemotherapeutic drugs and that high sensitivity to prednisone can be overridden by resistance to other agents. These biological considerations seem to be less applicable for T-ALL patients. Our data indicate that, in the end, resistance to prednisone (i.e. PPR) in T-ALL could not be overcome by the subsequent chemotherapy even in those patients who apparently had a reasonable response in the later course of induction treatment as reflected by M1 or M2 BMd15. The reliability of these data might be weakened due to the small patient numbers remaining in the T-ALL subgroups in this analysis. However, the results are supported by the recently published data on the prognostic impact of MRD in the AIEOP-BFM ALL 2000 trial.37 In this study, the PR in T-ALL also retained prognostic value (although this had only borderline significance) when analyzed in a multivariate model including the MRD risk groups. In contrast, in pB-ALL, PPR completely lost its adverse prognostic value if compared with PGR patients with the same PCR-MRD levels.38
Our data may suggest that the PR could be omitted as a stratification parameter for patients with pB-ALL. However, in the ALL-BFM 95 trial, the good outcome of pB-ALL patients with PPR and subsequently good BM response on Day 15 was achieved with an intensified high-risk treatment. Whether these results could be reproduced with less intensive treatment remains unclear. Therefore, we think that the omission of the PR as a risk stratification parameter should not be considered for the moment.
In clinical practice, the question often arises as to whether an early change or intensification of treatment is reasonable in patients with poor early response. Our data show that patients with M3 marrow on Day 15 still have a good chance of achieving CR by end of induction. Among those patients with 25% to less than 50% BM blasts on Day 15, 92% achieved CR (71 of 77 pB-ALL and 15 of 17 T-ALL patients). Even a fraction of patients with 75% or over BM blasts on Day 15 reached CR by this time point (8 of 17 pB-ALL patients and 8 of 12 T-ALL patients). This suggests that if the aim is just to achieve remission there is no strong evidence for the need for alternative ALL treatment at this point. However, treatment results of these patients are poor and might be improved by early treatment intensification.
In the stratification of T-ALL patients, combining BMd15 with PR added significant value to the response parameters alone and allowed stratification into two widely separated risk groups, the better of them with an excellent 8y-pEFS of almost 90% and another poor risk group with an 8y-pEFS of nearly 50%, including 74% of all T-ALL events.
In pB-ALL, in contrast, the use of the PR in addition to BMd15 failed to improve the discrimination between risk groups obtained through BMd15 alone. Yet the combination of BMd15 with the ALL-BFM 95 risk criteria or the NCI criteria, both using age and initial WBC, gave an added prognostic value. In the COG (or former CCG) protocols, the combination of NCI risk criteria with early (Day 7 and Day 14) marrow response has been used for risk stratification for many years.12,39 The ALL IC-BFM study group introduced cytomorphological BM response on Day 15 in the non-MRD-based protocol ALL IC-BFM 2002 for a risk stratification system which was based on the ALL-BFM 95 criteria, but shifted the patients to a higher risk group in the case of an M3 BMd15.40 ALL IC-BFM 2002 was performed in countries which did not have access to MRD diagnostics, mainly due to economic concerns.40 For these countries, optimization of risk stratification by the intelligent use of clinical parameters and cytomorphological response evaluation is worthwhile. However, the prognostic relevance of cytomorphological response must always be interpreted in the context of the specific chemotherapy regimen administered. Therefore, we should approach transferring our data onto other and, in particular, less intensive treatment regimens with caution.
Results of the ALL IC-BFM 2002 study have not yet been published. It will be interesting to see whether the results of the current study, which were generated in a setting with centralized cytomorphology services and, therefore, a high level of staff continuity, can be reproduced in a setting with decentralized cytomorphology services.
Polymerase chain reaction41 and flow cytometry42 have been shown to detect MRD and help discriminate between patients with a differential response at later time points when patients have already reached morphological remission. Recently, the AIEOP-BFM group published data on a total of 3,648 ALL patients (pB-ALL: n=3184; T-ALL: n=464).37,38 In these studies, the 5y-pEFS of patients already MRD negative at end of induction (MRD-SR) were 92.3±0.9% (pB-ALL) and 93.0±3.0% (T-ALL), respectively. Considering together all pB-ALL and T-ALL patients, this group made up 39.0% of all patients. An equally good pEFS of 93.0±1.7% (8y-pEFS) was achieved in our study in those patients with 0% blasts in the BM on Day 15. However, this group made up only 15.1% of the study population, showing that the PCR-MRD technique is able to allocate more patients for less intensive treatment.
To summarize, in the context of the ALL-BFM 95 treatment, in pB-ALL the combination of BM response on Day 15 with the ALL-BFM 95 risk criteria allows a more subtle definition of risk groups. The PR, included as high-risk stratification criterion in ALL-BFM 95, completely lost its significance in combination with BMd15. In T-ALL, BMd15 was also a better predictor of outcome than PR, though within the subgroup of patients with M1 (and possibly also M2) BMd15, the PR added an important prognostic effect.
Today, the ALL-BFM 95 protocol is regularly used as a chemotherapy protocol for childhood ALL in several countries. Our data demonstrate that the inclusion of BMd15 crucially improves the ALL-BFM 95 risk stratification in the context of the ALL-BFM 95 therapy. This is of particular interest in less affluent countries where limited economic resources mean expensive laboratory techniques cannot be used.
We thank all patients and their families who participated in this trial, and the physicians, nurses and study nurses of all the hospitals involved for their input in performing this study. We thank N Götz, D Janousek, U Meyer, I Krämer and K Mischke for data management.
ML and AM contributed equally to this manuscript.
Funding: the clinical trial was supported by the Deutsche Krebshilfe, Bonn, Germany (50-2614-Ri 6; H.R.). This work was also supported by the Madeleine-Schickedanz-Kinderkrebsstiftung, Fürth, Germany, which provided a fellowship to ML.
The online version of this article has a Supplementary Appendix.
Authorship and Disclosures
The information provided by the authors about contributions from persons listed as authors and in acknowledgments is available with the full text of this paper at www.haematologica.org.
- Received May 31, 2011.
- Revision received January 5, 2012.
- Accepted January 11, 2012.
- Copyright© Ferrata Storti Foundation