-Author name in bold denotes the presenting author
-Asterisk * with author name denotes a Non-ASH member

3211 A Machine Learning Based Model to Predict Two-Year Leukemia Free Survival in Cord Blood Transplantation for Acute Leukemia - a Data Mining Study, on Behalf of Eurocord, Cord Blood Committee and the Acute Leukemia Working Party of the EBMT

Program: Oral and Poster Abstracts
Session: 732. Clinical Allogeneic Transplantation: Results: Poster II
Sunday, December 6, 2015, 6:00 PM-8:00 PM
Hall A, Level 2 (Orange County Convention Center)

Roni Shouval, MD1*, Annalisa Ruggeri, MD, PhD2,3*, Myriam Labopin, MD4*, Mohamad Mohty, MD, PhD5, Guillermo Sanz, MD, PhD6, Gerard Michel, MD7*, Eefke Petersen, MD, PhD8, Patrice Chevallier, MD, PhD9*, Amal Al-Seraihy, MD10*, Noel-Jean Milpied, MD, PhD11, Cristina Diaz de Heredia, MD12*, William Arcese, MD13, Didier Blaise, MD, PhD14, Vanderson Rocha, MD, PhD15,16, Amit Gal, MSc17*, Ron Unger, PhD18*, Frederic Baron, MD, PhD19, Peter Bader, MD20,21, Eliane Gluckman, MD22 and Arnon Nagler, MD, MSc4,23

1Division of Hematology and Bone Marrow Transplantation, Sheba Medical Center, Ramat Gan, Israel
2Eurocord, Hôpital Saint Louis APHP, University Paris-Diderot, Paris, France
3Service d’Hématologie et Thérapie Cellulaire, Hôpital Saint-Antoine, Paris, France
4EBMT, Acute Leukemia Working Party, Paris, France
5Hematology Department, Saint-Antoine Hospital, AP-HP, Universite Pierre et Marie Curie, Paris, France
6Servicio de Hematologia, Hospital Universitario La Fe, Valencia, Spain
7Timone Enfants Hospital and Aix-Marseille University, Department of Pediatric Hematology and Oncology, Marseille, France
8University Medical Centre Utrecht, Utrecht, Netherlands
9Department of Hematology, Nantes University Hospital, Nantes, France
10Department of Pediatric Hematology/Oncology, King Faisal Specialist Hospital & Research Center, Riyadh, Saudi Arabia
11Service des maladies du sang, Hopital du Haut Leveque, Bordeaux, France
12Servicio de Hematologia y Oncologia Pediátrica, Hospital Vall d'Hebron, Barcelona, Spain
13Dept. of Hematology and Transplant, University of Rome 'Tor Vergata', Rome, Italy
14Programme de Transplantation et Therapie Cellulaire, Institut Paoli Calmettes, Marseille, France
15Churchill Hospital, Oxford University, Oxford, United Kingdom
16Eurocord - Monacord, Hôpital Saint Louis, Paris, France
17Tel Aviv University, Ra'anana, Israel
18The Mina & Everard Goodman Faculty of Life Sciences, Ramat Gan, Israel
19University of Liege, Liege, Belgium
20EBMT, Paediatric Diseases Working Party, Barcelona, Spain
21Division for Stem Cell Transplantation and Immunology, Hospital for Children and Adolescents, University Hospital Frankfurt, Goethe University, Frankfurt am Main, Germany
22Eurocord-Hopital Saint-Louis, Paris, France
23Hematology Division, Chaim Sheba Medical Center and Tel Aviv University, Tel-Hashomer, Ramat-Gan, Israel

Background: Umbilical cord blood transplantation (UCBT) is a potentially curative therapy for acute leukemia (AL) patients. Transplantation benefit must be balanced against risks, such as transplant-related mortality and relapse. The complex nature of hematopoietic stem cell transplantation (HCT) data, rich in interactions and possibly nonlinear associations, has motivated us to apply machine learning (ML) for predictive modeling. ML is a field of artificial intelligence and is part of the data mining approach to data analysis.

Our group has recently reported on an ML-based prediction model for short-term HCT outcomes (Shouval R et al; JCO 2015). Using an ML algorithm, the current study aimed to predict leukemia-free survival (LFS) at 2 years after UCBT, while exploring the importance of variables and their interactions.

Patients & Methods: A cohort of 3,149 UCBTs was analyzed. Inclusion criteria encompassed patients of all ages undergoing UCBT (single or double unit) in EBMT centers from 2004 to 2014 for AL, in any disease status. All conditioning and graft-versus-host disease prophylaxis regimens were included. A total of 24 variables were considered, including total nucleated cell (TNC) dose, donor and recipient HLA typing, as well as recipient, disease, and transplant characteristics.

The Random Survival Forest (RSF) ML algorithm was applied for model construction and data exploration. RSF is known to be adaptive to data, is able to automatically recover nonlinear effects and complex interactions among variables, and yields nonparametric prediction over test data. The analysis pipeline consisted of prediction model development, assessment of variable importance by minimal depth from the tree trunk (root), and exploration of the top-ranking variables with dependence plots. The latter promote understanding of non-trivial associations between variables and outcomes.
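The minimal depth ranking mentioned above can be illustrated with a small sketch. It is not the study's code: the toy trees, the variable names, and the helper functions `node`, `minimal_depths`, and `rank_by_minimal_depth` are all hypothetical. The sketch assumes that the minimal depth of a variable in one tree is the depth of its shallowest split (root = 0), and that a smaller average over the forest indicates a more important variable. Trees that never split on a variable are simply skipped here, which is a simplification.

```python
from statistics import mean


def node(var, left=None, right=None):
    """Build an internal node; a child of None stands for a terminal node."""
    return {"var": var, "left": left, "right": right}


def minimal_depths(tree, depth=0, found=None):
    """Return {variable: depth of its shallowest split} for one tree (root = 0)."""
    if found is None:
        found = {}
    if tree is None:  # terminal node: nothing to record
        return found
    var = tree["var"]
    found[var] = min(depth, found.get(var, depth))
    minimal_depths(tree["left"], depth + 1, found)
    minimal_depths(tree["right"], depth + 1, found)
    return found


def rank_by_minimal_depth(forest):
    """Average each variable's minimal depth over the trees that split on it,
    then sort variables from smallest (most important) to largest."""
    per_tree = [minimal_depths(tree) for tree in forest]
    variables = sorted({v for depths in per_tree for v in depths})
    averages = {v: mean(d[v] for d in per_tree if v in d) for v in variables}
    return sorted(averages, key=averages.get), averages


# Hypothetical three-tree forest; names are illustrative only.
toy_forest = [
    node("disease_status", node("age", node("tnc")), node("tnc")),
    node("age", node("disease_status"), node("tnc", node("disease_status"))),
    node("disease_status", node("tnc"), node("age")),
]

ranking, averages = rank_by_minimal_depth(toy_forest)
print(ranking)
print(averages)
```

In this toy forest the variable that splits closest to the root on average is ranked first. A real RSF implementation would also need a rule for variables that never split in a given tree.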

Results: The 2-year LFS was 49%, with a median follow-up of 30 months. An RSF model of 1,000 trees was developed, with each tree constructed on a bootstrap sample from the original cohort. A prediction error of 36.0% was calculated. The 10 most predictive variables (in ascending order of minimal depth, that is, most predictive first) were disease status, age, TNC harvested and infused, recipient CMV serostatus, interval from diagnosis to UCBT, transplant year, previous autologous transplant, and use of anti-thymocyte globulin (ATG).
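The abstract does not state how the prediction error was computed. RSF software commonly reports it as 1 minus Harrell's concordance index (C-index); the sketch below assumes that convention, under which an error of 36.0% would correspond to a C-index of about 0.64. The function name `concordance_index` and the toy data are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data.

    times  : follow-up time for each patient
    events : 1 if relapse/death was observed, 0 if censored
    risks  : predicted risk score (higher = worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk failed earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data (months of follow-up, event flags, predicted risks).
toy_times = [5, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.6, 0.5, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
prediction_error = 1.0 - c_index
print(c_index, prediction_error)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so under this convention an error below 50% indicates better-than-chance discrimination.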

Selected findings from exploration of variable-outcome relationships with dependence plots included a varying effect of TNCs in specific subpopulations. Increasing the number of infused TNCs had a positive effect on predicted LFS in patients receiving HLA-mismatched (2 or more HLA mismatches) (figure) or single-unit CB grafts, and in patients in earlier disease status or of older age. ATG administration was associated with worse LFS, whether unadjusted or adjusted for all other variables. However, there was an additional negative effect in patients with advanced disease status, recipients of HLA-mismatched or single-unit CB grafts, and older patients. Patients in 1st complete remission (CR) had higher predicted LFS than those in 2nd CR. However, in patients receiving an HLA-mismatched or a double CB graft, the difference in LFS between CR1 and CR2 was attenuated. Younger age had a favorable impact in early disease status, but lost its positive effect in advanced disease.

Conclusions: A prediction model for LFS 2 years post UCBT was developed using the RSF ML algorithm. Variables were ranked according to their predictive contribution. Disease status, age, and TNC count were found to be the most important factors. Dependence plots revealed interactions and nonlinear associations between variables and the outcome, such as the dependence of the cell dose effect on HLA disparity. Apart from its clinical findings, the study carries methodological significance. A novel ML approach for prediction, variable selection and data exploration, accounting for long-term time-to-event outcomes, has proved useful in the field of HCT.

Figure: Variable marginal dependence coplot of predicted LFS at 2 years against TNC, conditional on HLA matching. Individual cases are marked with blue circles (alive or censored) and red "x" marks (event). Linear smooth (a linear extrapolation of the prediction function), with shaded 95% confidence band, indicates trends of variable dependence.
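The idea behind such a conditional dependence plot can be sketched numerically: group the model's predicted LFS by the conditioning variable and fit a straight line of predicted LFS on TNC within each group. The sketch below is an illustration only. All numbers are invented, the names `linear_fit`, `toy_predictions`, and `slopes` are hypothetical, and an ordinary least-squares line stands in for the plot's linear smooth without its confidence band.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented (TNC dose, HLA group, predicted 2-year LFS) triples.
toy_predictions = [
    (2.0, "matched", 0.50), (3.0, "matched", 0.51),
    (4.0, "matched", 0.52), (5.0, "matched", 0.53),
    (2.0, "mismatched", 0.35), (3.0, "mismatched", 0.40),
    (4.0, "mismatched", 0.45), (5.0, "mismatched", 0.50),
]

# Condition on HLA group: one fitted trend line per group.
slopes = {}
for group in ("matched", "mismatched"):
    tnc = [t for t, g, _ in toy_predictions if g == group]
    lfs = [p for _, g, p in toy_predictions if g == group]
    slopes[group], _ = linear_fit(tnc, lfs)

print(slopes)
```

A steeper slope in one group than in the other is the kind of pattern that suggests an interaction between the plotted variable and the conditioning variable.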

 

Disclosures: Mohty: Janssen: Honoraria; Celgene: Honoraria. Sanz: JANSSEN CILAG: Honoraria, Research Funding, Speakers Bureau. Bader: Neovii: Other: Institutional grants; Medac: Other: Institutional grants; Riemser: Other: Institutional grants; Amgen: Consultancy; Novartis: Consultancy; Jazz Pharmaceuticals: Consultancy.
