Type: Oral
Session: 803. Emerging Tools, Techniques, and Artificial Intelligence in Hematology: New Approaches to Predicting Patient Outcomes in Hematologic Malignancies
Hematology Disease Topics & Pathways:
Adult, Artificial intelligence (AI), Research, Health outcomes research, Clinical Research, Real-world evidence, Survivorship, Emerging technologies, Technology and Procedures, Human, Study Population, Machine learning
For the shared patient-physician informed consent process before allogeneic hematopoietic cell transplantation (alloHCT), it is important to estimate the individual risk of a fatal outcome. This risk depends on patient, donor and disease characteristics and on their interactions. Scores such as the Hematopoietic Cell Transplantation-specific Comorbidity Index (HCT-CI) are used for this purpose, but they have limited accuracy and capture only a small number of variables. Machine learning models can exploit feature dependencies and interactions better than a simple risk score, which can result in higher accuracy and insight into feature importance.
Methods
Data from 491 patients with hematological malignancies undergoing alloHCT at our center between 2008 and 2022 were used to compare four algorithms: decision tree, AdaBoost, gradient boosting and random forest (RF). Twenty-eight features were used for model generation to predict death within the first year after alloHCT. These included known risk factors such as the HCT-CI as well as immunological pre-transplant data. The data were split 70/30 into stratified training and test sets. The test set was used only after model selection and performance estimation on the training set were complete, to assess generalizability.
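A stratified 70/30 split keeps the proportion of events roughly equal in both sets. The following is a minimal standard-library sketch of that idea, not the authors' implementation; the function name `stratified_split`, the seed and the example labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) so that each class keeps its share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random assignment within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Illustrative labels only (1 = death within one year, 0 = alive); not study data.
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(labels)
```

With these illustrative labels, the event rate is 30% in both the training and the test indices.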
With the training set, we performed a 10-times repeated five-fold nested cross-validation. The outer loop of the nested cross-validation was used to obtain the area under the receiver operating characteristic curve (AUC) to assess algorithm performance. Hyperparameter selection was performed in the inner loop of the cross-validation. For feature analysis we used SHAP values (Lundberg & Lee, 2017).
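The structure of a repeated nested cross-validation can be sketched as index bookkeeping: each outer fold is held out for performance estimation, and hyperparameters are tuned only on inner folds drawn from the remaining outer-training data. This is a standard-library sketch under assumptions, not the authors' pipeline; the helper names, the seed, the example labels and the choice of five inner folds are hypothetical (the number of inner folds is not stated in the abstract).

```python
import random

def stratified_folds(indices, labels, k, rng):
    """Split `indices` into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels[i] for i in indices)):
        members = [i for i in indices if labels[i] == cls]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)  # deal class members round-robin
    return folds

def nested_cv_splits(labels, n_repeats=10, k_outer=5, k_inner=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) per repeat and outer fold."""
    rng = random.Random(seed)
    all_idx = list(range(len(labels)))
    for _ in range(n_repeats):
        outer = stratified_folds(all_idx, labels, k_outer, rng)
        for held_out in range(k_outer):
            outer_test = outer[held_out]  # used only for performance estimation
            outer_train = [i for f, fold in enumerate(outer)
                           if f != held_out for i in fold]
            # Inner folds are built from outer_train only (hyperparameter tuning).
            inner = stratified_folds(outer_train, labels, k_inner, rng)
            inner_splits = [
                ([i for g, fold in enumerate(inner) if g != v for i in fold], inner[v])
                for v in range(k_inner)
            ]
            yield outer_train, outer_test, inner_splits

# Illustrative labels only; not study data.
labels = [1] * 20 + [0] * 80
splits = list(nested_cv_splits(labels))
```

Ten repeats of five outer folds give 50 outer evaluations, and no outer test index ever appears in an inner split, which is what keeps the outer estimate free of tuning bias.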
Results
Of the four tree-based models, RF showed the best performance for prediction of death within the first year after alloHCT, with an AUC of 0.77 (± 0.01) in the training set compared to 0.75 (± 0.01) for gradient boosting, 0.74 (± 0.01) for AdaBoost and 0.67 (± 0.01) for the decision tree. RF was therefore considered a reliable prediction tool and was used for data exploration. In the test set, RF achieved an AUC of 0.79.
Exploring the RF model with SHAP values, the two factors with the highest influence on the prediction of death were age (mean absolute SHAP value 0.066) and the pre-transplant B-lymphocyte count (0.057). The HCT-CI was only the 8th most important feature for our model (0.024), with less than half the impact of pre-transplant B cells. The features ranked third to seventh in importance were known risk factors: pre-transplant performance status (0.042), pre-transplant lactate dehydrogenase levels (0.037), conditioning intensity (0.032), pre-transplant disease activity (0.029) and presence of HLA-mismatch (0.025).
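A feature ranking of this kind is obtained by averaging the absolute SHAP value of each feature over all patients and sorting in descending order. The sketch below assumes the per-patient SHAP values have already been computed; the function name, feature names and the small matrix are hypothetical toy values, not study data.

```python
def rank_by_mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value (rows = patients, columns = features)."""
    n_rows = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Toy SHAP values for three hypothetical patients; not study data.
features = ["age", "b_cell_count", "hct_ci"]
shap_matrix = [
    [0.08, -0.06, 0.02],
    [-0.05, 0.05, -0.03],
    [0.07, -0.06, 0.02],
]
ranking = rank_by_mean_abs_shap(shap_matrix, features)
```

Taking the absolute value before averaging matters: a feature that pushes some predictions toward death and others toward survival would otherwise average out to a misleadingly small importance.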
Concerning the two most important features, age and B-lymphocyte count, the prediction direction changed from survival to death at an age of 63 years and at a pre-transplant B-cell count of 20 cells/µl. Based on this, we derived a simple risk score with three groups: older, younger/B-cell depleted and younger/immunocompetent. We compared our three groups to risk stratification using the HCT-CI (0, 1-2, ≥3 points). Our score achieved significant separation for overall survival (p<0.01 for the three groups), with a survival probability 1 year post alloHCT of 97% for younger/immunocompetent, 76% for younger/B-cell depleted and 66% for older patients. There was no significant separation between the HCT-CI based groups (p>0.05), with a survival probability 1 year post alloHCT of 85% for 0 points, 82% for 1-2 points and 74% for ≥3 points.
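The three-group score can be written as a short decision rule. This sketch uses the two cut-offs reported above; whether each boundary value belongs to the higher-risk or lower-risk side is not stated in the abstract, so the comparisons `>=` for age and `<` for B cells are assumptions, as are the function and constant names.

```python
AGE_CUTOFF_YEARS = 63        # reported cut-off for age
B_CELL_CUTOFF_PER_UL = 20    # reported cut-off for pre-transplant B cells (cells/µl)

def risk_group(age_years, b_cells_per_ul):
    """Assign one of the three risk groups (boundary handling is an assumption)."""
    if age_years >= AGE_CUTOFF_YEARS:
        return "older"
    if b_cells_per_ul < B_CELL_CUTOFF_PER_UL:
        return "younger/B-cell depleted"
    return "younger/immunocompetent"

# Hypothetical example patients.
examples = [risk_group(70, 150), risk_group(45, 5), risk_group(45, 150)]
```

Age is checked first, so B-cell count only subdivides the younger patients, matching the three groups named in the text.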
Thus, by deriving a score from a reliable machine learning model, we achieved effective risk stratification based on age and pre-transplant humoral status, outperforming the HCT-CI.
Conclusion
Using SHAP values, we identified both known (age) and previously unknown (pre-transplant B-cell deficiency) risk factors for mortality post alloHCT. B-lymphocyte-mediated immunologic mechanisms could influence mortality after alloHCT; this needs to be investigated in animal models. B-cell depletion could also be an easily accessible surrogate marker for previous treatment intensity and disease severity, which would explain its value for predicting mortality.
In summary, our study shows how model generation and feature analysis in transplant hematology can be interconnected to build reliable models, retain a level of explainability and identify new biomarkers.
Disclosures: Wäsch: Amgen, BMS/Celgene, Janssen, Kite/Gilead, Novartis, Pfizer, Sanofi: Consultancy; Janssen, Sanofi: Research Funding; Abbvie, Amgen, BMS/Celgene, Janssen, Kite/Gilead, Pfizer, Sanofi: Honoraria. Zeiser: Neovii: Consultancy; Mallinckrodt: Consultancy, Honoraria; Incyte: Consultancy, Honoraria; Medac: Honoraria; Sanofi: Honoraria; Novartis: Consultancy, Honoraria; Ironwood Pharmaceuticals, Inc.: Consultancy. Wehr: Jazz: Honoraria, Other: travel grant; MSD: Honoraria, Other: travel grant; Takeda: Honoraria, Other: travel grant.