-Asterisk * with author name denotes a Non-ASH member

4415 Multi-Modal Analysis and Federated Learning Approach for Classification and Personalized Prognostic Assessment in Myeloid Neoplasms

Program: Oral and Poster Abstracts
Session: 637. Myelodysplastic Syndromes – Clinical and Epidemiological: Poster III
Hematology Disease Topics & Pathways:
Research, Acute Myeloid Malignancies, MDS, AML, artificial intelligence (AI), adult, Translational Research, elderly, Clinical Research, bioinformatics, Chronic Myeloid Malignancies, CMML, Diseases, real-world evidence, survivorship, Myeloid Malignancies, Technology and Procedures, Study Population, Human, machine learning, molecular testing
Monday, December 12, 2022, 6:00 PM-8:00 PM

Saverio D'Amico, MSc1*, Lorenzo Dall'Olio, PhD2*, Cesare Rollo, PhD3*, Patricia Alonso, PhD4*, Iñigo Prada-Luengo, PhD5*, Daniele Dall'Olio, PhD2*, Claudia Sala, PhD2*, Matteo Bersanelli, PhD6*, Elisabetta Sauta, PhD1*, Marilena Bicchieri, PhD6*, Pierandrea Morandini, MEng1*, Tobia Tommasini, MSc1*, Victor Savevski, MEng1*, Lin-Pierre Zhao, MD7*, Uwe Platzbecker, MD8, Maria Diez-Campelo, MD9*, Valeria Santini, MD10, Pierre Fenaux11, Torsten Haferlach, MD12, Anders Krogh, PhD5*, Santiago Zazo, PhD4*, Piero Fariselli, PhD3*, Tiziana Sanavia, PhD3*, Matteo G. Della Porta, MD13* and Castellani Gastone, PhD2*

1Artificial Intelligence Center, Humanitas Research Hospital, Rozzano (Milan), Italy
2DIMES, University of Bologna, Bologna, Italy
3University of Turin, Torino, Italy
4Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
5University of Copenhagen, Copenhagen, Denmark
6IRCCS Humanitas Research Hospital, Rozzano, Italy
7Department of Hematology and Bone Marrow Transplantation, Hôpital Saint-Louis, Assistance Publique des Hôpitaux de Paris (AP-HP), Paris, France
8Medical Clinical and Policlinic 1, Hematology and Cellular Therapy, University Hospital Leipzig, Leipzig, Germany
9Department of Hematology, Salamanca-IBSAL University Hospital, Salamanca, Spain
10MDS Unit, University of Florence, AOUC, Florence, Italy
11Service d'Hématologie Séniors, Hôpital Saint-Louis, Université Paris 7, Paris, France
12MLL Munich Leukemia Laboratory, Munich, Germany
13Cancer Center, Humanitas Research Hospital and Center for Accelerating Leukemia Research (CALR), Humanitas University, Artificial Intelligence Center, Milan, Italy

Background

Myeloid neoplasms (MN) present clinical and molecular heterogeneity and therefore a risk-adapted treatment strategy is mandatory. In MN, classification and prognostic tools based on clinical and morphologic criteria are being complemented by introducing genomic features. The clinical implementation of next-generation classifications and prognostic systems requires the availability of a robust methodological framework together with a solution to provide access to these technologies for clinicians.

Aims

Machine learning (ML) and Deep Learning (DL) approaches produce powerful predictive models and offer explainable solutions to assure full interpretability of a model when applied in clinical settings. Here we provided a comprehensive assessment of explainable ML/DL-based methods for classification and prognostic assessment of MN and we developed a solution to apply these methods across different clinical Centres through a Federated Learning (FL) approach.

Methods

We analysed two cohorts of patients with myelodysplastic syndrome (MDS) from the GenoMed4All consortium (n=2,043 and n=2,384) with available clinical and molecular features to train and validate the models. Methods were then applied to other MN, i.e. acute myeloid leukemia (AML, n=1,154) and chronic myelomonocytic leukemia (CMML, n=1,037). We stratified patients by two clustering approaches: Hierarchical Dirichlet Process (HDP), and HDBSCAN combined with UMAP dimensionality reduction. We trained a Random Forest (RF) classifier to assign new patients to the existing clusters, using Balanced Accuracy (BA) and Cohen's kappa (CK) as performance metrics. We then compared different survival prediction methods: the CoxPH model (and its penalized version), Random Survival Forests, DeepCox, Gradient Boosting and XGBoost survival methods. Model explainability was assessed with the SHapley Additive exPlanations (SHAP) approach. The C-index was used to evaluate the models' performance. Finally, we developed a Federated Learning (FL) environment together with an imputation approach that handles missing values with a deep decoder model.
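As an illustration of the evaluation metric named above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' implementation: the function name, the argument names and the simplified handling of tied survival times (such pairs are skipped) are assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (simplified sketch).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk scores (higher = shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject censored: pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk failed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study cohorts)
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [0.9, 0.6, 0.4, 0.1])
c_reversed = concordance_index(times, events, [0.1, 0.4, 0.6, 0.9])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.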

Results

In the MDS training cohort, we identified 18 and 8 clusters by using HDBSCAN and HDP, respectively (Figure 1). We measured the average Silhouette Coefficient on the data space and obtained the following performance on the classification task: HDBSCAN (BA: 92.7±1.3%, CK: 92.1±1.4%) and HDP (BA: 85.8±0.8%, CK: 83.3±0.9%). Similar distributions were observed in the validation cohort. Model explainability analysis (SHAP) showed that similar features drive patients' classification in both populations.
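For readers unfamiliar with the two classification metrics, the sketch below computes Balanced Accuracy (the mean of per-class recall) and Cohen's kappa (observed agreement corrected for chance agreement) from true and predicted cluster labels. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [k for k, t in enumerate(y_true) if t == label]
        hits = sum(1 for k in idx if y_pred[k] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1.
    """
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Toy cluster assignments (hypothetical)
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "B"]
ba = balanced_accuracy(y_true, y_pred)
ck = cohen_kappa(y_true, y_pred)
```

Both metrics are less sensitive to unequal cluster sizes than plain accuracy, which is relevant when some clusters contain few patients.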

The comparison of survival prediction methods for MDS is displayed in Figure 2, which shows the models' performance in the two cohorts considering demographic, clinical, cytogenetic and genomic features. Non-linear ML/DL-based methods outperformed classical CoxPH-based approaches without requiring extensive data pre-processing. Moreover, all the models showed higher C-indices than the conventional IPSS-R score. SHAP analysis showed similar feature importance rankings for the training and validation cohorts. Models were then applied to the AML and CMML cohorts, showing consistent results across different types of MN.

Finally, we aimed to develop a federated learning (FL) solution (FedAvg, with a deep decoder model for missing data imputation) to favour a wide clinical implementation of the models. In this setting, each node trains on its own local data and shares only model parameters, which are aggregated into a global model, so that the data in each node adhere to data privacy policies. Learning from all nodes was expected to improve model performance compared with training on a single node. As a reference, data were also collected on a single server and used to build and train a centralized model. We implemented a CoxPH model in a setting of 3 nodes (Centers) contributing 60%, 30% and 10% of the training data, respectively. We observed that the poor node (i.e., the node contributing 10% of the data) benefited from FedAvg compared with working in an isolated setting (C-index 0.63 vs. 0.54). The centralized model trained on the whole dataset presented the highest performance (C-index 0.74).
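The aggregation step of FedAvg, as it is commonly described, is a weighted average of locally trained parameters, with weights proportional to each node's number of training samples. The sketch below shows only that step, under the assumption that each node returns a flat list of model coefficients (for example, CoxPH coefficients). The function name, parameter values and node sizes are hypothetical, and local training and communication rounds are omitted.

```python
def fedavg(node_params, node_sizes):
    """Weighted average of per-node parameter vectors.

    node_params : list of parameter lists, one per node (equal lengths)
    node_sizes  : number of training samples held by each node
    """
    if len(node_params) != len(node_sizes):
        raise ValueError("one size is required per node")
    total = sum(node_sizes)
    n_params = len(node_params[0])
    return [
        sum(params[k] * size for params, size in zip(node_params, node_sizes)) / total
        for k in range(n_params)
    ]


# Three nodes holding 60%, 30% and 10% of the data (toy coefficients)
local_coefficients = [[1.0, 0.0], [2.0, 1.0], [4.0, -1.0]]
global_coefficients = fedavg(local_coefficients, [60, 30, 10])
```

Because only coefficients leave each node, patient-level records never need to be transferred to the aggregating server.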

Conclusion

Machine Learning/Deep Learning approaches produce explainable and robust solutions to optimize classification and prognostic assessment in MN, as a basis for personalized medicine programs in these disorders. Federated learning algorithms allow a wide clinical implementation of the models by ensuring high performance and data protection.

Disclosures: Platzbecker: Geron: Honoraria; Silence Therapeutics: Honoraria; Janssen: Honoraria; Takeda: Honoraria; Novartis: Honoraria; Jazz: Honoraria; Abbvie: Honoraria; BMS/Celgene: Honoraria. Diez-Campelo: BluePrint: Membership on an entity's Board of Directors or advisory committees; Takeda: Honoraria, Membership on an entity's Board of Directors or advisory committees; Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees; Bristol Myers Squibb: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding. Santini: Takeda: Membership on an entity's Board of Directors or advisory committees; Syros: Membership on an entity's Board of Directors or advisory committees; Servier: Membership on an entity's Board of Directors or advisory committees; Otsuka: Membership on an entity's Board of Directors or advisory committees; Novartis: Honoraria, Membership on an entity's Board of Directors or advisory committees; Menarini: Membership on an entity's Board of Directors or advisory committees; Geron: Membership on an entity's Board of Directors or advisory committees; BMS: Honoraria, Membership on an entity's Board of Directors or advisory committees; AbbVie: Membership on an entity's Board of Directors or advisory committees. Fenaux: AbbVie, BMS, Janssen, Jazz, Novartis: Consultancy, Honoraria, Research Funding. Haferlach: Munich Leukemia Laboratory: Current Employment, Other: Part ownership.
