-Author name in bold denotes the presenting author
-Asterisk * with author name denotes a Non-ASH member

790 Independent, International Validation and Refinement of a Machine Learning Algorithm to Classify Acute Leukemia Using Routine Laboratory Features

Program: Oral and Poster Abstracts
Type: Oral
Session: 903. Health Services and Quality Improvement: Myeloid Malignancies: Innovative Approaches to Improve Quality of Care, Affordability, and Outcomes
Hematology Disease Topics & Pathways:
Lymphoid Leukemias, ALL, Acute Myeloid Malignancies, AML, Artificial intelligence (AI), Research, Clinical Practice (Health Services and Quality), Clinical Research, Diversity, Equity, and Inclusion (DEI), Diseases, Lymphoid Malignancies, Myeloid Malignancies, Technology and Procedures, Machine learning
Monday, December 9, 2024: 11:15 AM

Amin T. Turki, MD1,2*, Alberto Hernández Sánchez, MD3*, Wellington Silva4, Magdalena Karasek, MD5*, Luca Guarnera, MD6, Koray Yalçin, MD7*, Amir Enshaei8*, Marta Sobas, MD9*, Dirk Reinhardt, MD10, Maria M Rivas, MD11*, Deepak Kumar Mishra, MD12*, Eduardo Rego13*, Ahmet Koc14*, Paola Núñez Medina15*, Maria Teresa Voso, MD16, Anthony Moorman17*, Felix Nensa, MD18* and Merlin Engelke19*

1Institute for Artificial Intelligence in Medicine, University Hospital Essen, Essen, Germany
2Department of Hematology and Oncology, Marienhospital, Ruhr-University Bochum, Bochum, Germany
3Hematology Department, Hospital Universitario de Salamanca (CAUSA/IBSAL), Salamanca, Spain
4University of Sao Paulo, Faculdade De Medicina USP, Sao Paulo, Brazil
5Department of Hematology, Blood Neoplasms and Bone Marrow Transplantation, Wroclaw Medical University, Wroclaw, Poland
6Tor Vergata University, Rome, Italy
7Bahcesehir University Medical Park Göztepe Hospital, Istanbul, Turkey
8Wolfson Childhood Cancer Research Centre, Newcastle University, Newcastle, United Kingdom
9Department of Hematology, Blood Neoplasms and Bone Marrow Transplantation, Medical University of Wroclaw, Wroclaw, Poland
10University Children’s Hospital Essen, Department of Pediatric Hematology and Oncology, Essen, Germany
11Hospital Universitario Austral, Buenos Aires, Argentina
12Laboratory Hematology, Cytogenetics & Molecular Pathology, Tata Medical Center, Kolkata, West Bengal, India
13Hospital das Clinicas da Faculdade de Medicina da Universidade de Sao Paulo, Sao Paulo, Brazil
14Department of Pediatric Hematology and Oncology, Marmara University Faculty of Medicine, Istanbul, Turkey
15Department of Hematology, University of Salamanca, Salamanca, Spain
16Department of Biomedicine and Prevention, University of Rome Tor Vergata, Rome, Italy
17Leukaemia Research Cytogenetics Group, Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
18Institute for AI in Medicine, University Hospital Essen, Essen, Germany
19Department of AI in Medicine, University Hospital Essen, Essen, Germany

The timely diagnosis of acute leukemias (AL) can be challenging in resource-constrained settings. Patients, particularly in low- and middle-income countries, face various barriers to accessing specialized diagnostics. Delays in diagnosis and referral, especially for patients with acute promyelocytic leukemia (APL), increase early mortality (Rego Blood 2013, Odetola and Tallman ASH Educ Program 2023). Most recently, routine laboratory features have been leveraged to develop and test machine learning (ML) classification algorithms for predicting AL types in multicenter French cohorts (Alcazer, Lancet Digital Health, 2024). Yet, their global generalizability has not been extensively tested.

Methods:

To test these algorithms, we assembled a multicenter retrospective cohort of patients with diagnosed AL from 9 countries, whose laboratory features (total leukocytes, monocyte and lymphocyte counts, platelets, MCV, MCHC, LDH, fibrinogen, prothrombin activity in %, age) were obtained at the earliest timepoint of leukemia diagnosis at hospital contact. The cohort was inclusive of ethnic, social, and age diversity (range 0.08 – 97 years) and included both sexes (female 42.7%) as well as adult (≥ 18 years, n=1025) and pediatric patients (n=1771). The top-performing model in the development cohort, an extreme gradient boosting (XGB) model, was employed for testing. A Python package was developed that provides data preparation through HL7/FHIR or CSV tables, predictions using an embedded R script, and evaluation using Weights & Biases. The model was run separately for each site to account for cohort heterogeneity. The cutoff for missing features was 20%. Feature importance was analyzed by determining SHapley Additive exPlanations (SHAP) values. Misclassified patients were further analyzed with respect to the clinical significance of their features, using statistical, machine-learning, and dimensionality-reduction methods. This study was approved by the ethics committee of the University of Duisburg-Essen (N°24-11882-BO).
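The data preparation step described above (a missing-feature cutoff followed by site-wise evaluation) could be sketched roughly as follows. This is a minimal illustration and not the study's package: the dictionary keys, the `site` field, and the assumption that the 20% cutoff is applied per patient across the ten listed features are all hypothetical.

```python
# Hypothetical feature keys mirroring the ten features listed in the Methods.
FEATURES = [
    "leukocytes", "monocytes", "lymphocytes", "platelets", "mcv",
    "mchc", "ldh", "fibrinogen", "prothrombin_activity_pct", "age",
]

def missing_fraction(record: dict) -> float:
    """Share of the expected features that are absent or None in one record."""
    missing = sum(1 for name in FEATURES if record.get(name) is None)
    return missing / len(FEATURES)

def split_by_site(records: list, max_missing: float = 0.20) -> dict:
    """Group records by site, keeping only those at or below the cutoff.

    Assumption: the cutoff is inclusive and applied per patient.
    """
    by_site: dict = {}
    for rec in records:
        if missing_fraction(rec) <= max_missing:
            by_site.setdefault(rec["site"], []).append(rec)
    return by_site

# Toy records (invented values, for illustration only).
complete = {name: 1.0 for name in FEATURES}
two_missing = {**complete, "fibrinogen": None, "ldh": None}
three_missing = {**two_missing, "mchc": None}

demo = split_by_site([
    {"site": "A", **complete},
    {"site": "A", **two_missing},    # 20% missing: kept
    {"site": "B", **three_missing},  # 30% missing: dropped
])
```

Grouping by site before scoring mirrors the statement that the model was run separately for each site; each group would then be passed to the classifier on its own.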

Results:

In 2796 patients with diagnosed AL, the previously published “confident” predictions of the algorithm reached peak median AUROC values of up to 99.7 for APL, 98.8 for acute myeloid leukemia (AML) and 98.8 for acute lymphoblastic leukemia (ALL). High scores with “confident” predictions were obtained in cohorts from Europe (e.g. AML F1 score 0.97 [95%CI, 0.972-0.973]), Asia (e.g. ALL F1 score 0.94 [95%CI, 0.937-0.943]) and Latin America (e.g. AML F1 score 0.98 [95%CI, 0.976-0.978]). “Confident” predictions, however, were only available for 41-5% of patients depending on the cohort. The accuracy of the “base” prediction of AL varied across sites and countries. The model predicted APL with median AUROC between 0.98 and 0.79 and other types of AML with median AUROC between 0.87 and 0.60. The best “base” algorithm performance was recorded for AML and APL with the data from Salamanca, indicating some feature dependencies of the algorithm.
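For readers less familiar with the reported metrics, the following sketch shows how a one-vs-rest AUROC and an F1 score can be computed per site and summarized as a median across sites. It uses only the Python standard library; the function names and the toy numbers are illustrative assumptions, not the study's evaluation code or data.

```python
from statistics import median

def auroc(labels, scores):
    """One-vs-rest AUROC: chance that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def f1_score(labels, predictions):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy per-site data: 1 = target class (e.g. APL), 0 = any other AL type.
sites = {
    "site_a": ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    "site_b": ([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.2]),
    "site_c": ([1, 0, 0], [0.6, 0.6, 0.1]),
}
per_site_auroc = {name: auroc(y, s) for name, (y, s) in sites.items()}
median_auroc = median(per_site_auroc.values())
example_f1 = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

With these toy inputs the per-site AUROC values are 1.0, 0.75 and 0.75, so the median is 0.75. Note that AUROC computed this way lies between 0 and 1; values such as 99.7 correspond to the same quantity expressed in percent.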

In the pediatric subsets, ALL was the most frequently diagnosed leukemia, and cohorts reached a median AUROC of 0.78 (range 0.65-0.78), similar to adult ALL. However, the algorithm, originally developed on adult cohorts, did not generalize well to pediatric AML: its F1 scores (range 0.40-0.32) were lower than in pediatric ALL (range 0.72-0.68). We examined potential algorithm limitations, e.g., misclassified patients, to identify sources of bias. Higher proportions of missing values reduced the precision of the predictions, which is why we refined the missing-value cutoff. The most important features in SHAP analysis were prothrombin activity and monocyte count across all predictions, with LDH additionally important for ALL, MCV and age for AML, and fibrinogen and MCHC for APL. Misclassified AML patients were predicted as ALL when they had low monocyte counts or were missing this feature. A few AML patients with impaired coagulation (e.g. PT <60) and normal leukocytes were misclassified as APL. ALL patients with high monocyte counts, higher MCV, and lower LDH were misclassified as AML. We adjusted the scripts for limitations and statistical outliers to improve the algorithm’s applicability in clinical practice.
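A simple starting point for this kind of misclassification analysis is to tally true versus predicted classes, so that patterns such as AML predicted as ALL become visible. The sketch below is only an illustration with invented labels; it does not reproduce the statistical or dimensionality-reduction methods used in the study.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count each (true class, predicted class) pair."""
    return Counter(zip(true_labels, predicted_labels))

def misclassified(counts):
    """Keep only the pairs where the prediction differs from the truth."""
    return {pair: n for pair, n in counts.items() if pair[0] != pair[1]}

# Invented example labels, for illustration only.
true_labels = ["AML", "AML", "ALL", "APL", "AML", "ALL"]
predicted = ["AML", "ALL", "ALL", "APL", "APL", "AML"]

counts = confusion_counts(true_labels, predicted)
errors = misclassified(counts)
n_errors = sum(errors.values())
```

Each misclassified pair can then be inspected against the features named above, for example by checking monocyte counts among AML patients predicted as ALL.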

Conclusion:

Inclusive ML tools can reduce access barriers in hematology. This first international validation of an ML tool to support the diagnosis of AL provides important insight into its validity and practical use. Validating the model on more patients and countries will further inform its generalizability.

Disclosures: Turki: Biomarin, AMGEN: Speakers Bureau; Onkowissen.tv: Speakers Bureau; CSL Behring: Consultancy; Pfizer: Consultancy; Janssen: Other: Travel reimbursements; Neovii: Other: Travel reimbursements; Maat Pharma: Consultancy; Novartis: Other: Travel reimbursements. Reinhardt: Medac, BMS, Immedica: Research Funding. Voso: Novartis: Other: Research support, Speakers Bureau; Celgene/BMS: Other: Research support, Advisory Board, Speakers Bureau; Syros: Other: Advisory Board; Astra Zeneca: Speakers Bureau; Abbvie: Speakers Bureau; Jazz: Other: Advisory Board, Speakers Bureau; Astellas: Speakers Bureau. Nensa: Siemens Healthineers: Research Funding.
