Session: 509. Bone Marrow Failure and Cancer Predisposition Syndromes: Congenital: Poster I
Hematology Disease Topics & Pathways:
Acquired Marrow Failure Syndromes, Research, Artificial intelligence (AI), Inherited Marrow Failure Syndromes, Bone Marrow Failure Syndromes, Clinical Practice (Health Services and Quality), Aplastic Anemia, Clinical Research, Genetic Disorders, Bioinformatics, Diseases, Real-world evidence, Registries, Technology and Procedures, Machine learning
To improve the diagnostic work-up, we applied machine learning (ML), conducting both supervised and unsupervised analyses on a cohort of 140 patients (85/140 males and 55/140 females; median age 13.28 years, range 0.25 to 61.29) referred to our Hematology Unit of the Giannina Gaslini Institute from 1989 to April 2023 for persistent cytopenia and/or features suggestive of TBD. For each patient, clinical, biochemical, and genetic features were collected. Patients were labeled as “TBD” (n=20), “Other congenital diseases” (n=27), or “Undefined diagnosis” (n=93), according to their molecular diagnosis.
In the supervised analysis, a Random Forest model was trained on the subset of patients with a confirmed diagnosis (n=47), achieving a prediction accuracy of 75% for “TBD” patients and 96% for “Other congenital diseases” patients. The model was then applied to the subset of undiagnosed patients: 16/93 were predicted as potential TBD and 77/93 as potential Other congenital diseases, corresponding to 17.2% and 82.8% of possibly reallocated diagnoses, respectively.
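The per-class accuracy and the shares of predicted labels reported above are simple ratios. The following is a minimal sketch of how such figures can be computed, not the authors' pipeline: the helper names `per_class_accuracy` and `predicted_shares` and the toy label lists are hypothetical, and only the counts 16 and 77 out of 93 are taken from the abstract.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples within each true class."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


def predicted_shares(y_pred):
    """Percentage of samples assigned to each predicted label."""
    counts = Counter(y_pred)
    n = len(y_pred)
    return {label: 100.0 * c / n for label, c in counts.items()}


# Hypothetical toy labels, for illustration only (not patient data).
y_true = ["TBD", "TBD", "TBD", "TBD", "Other", "Other", "Other", "Other"]
y_pred = ["TBD", "TBD", "TBD", "Other", "Other", "Other", "Other", "Other"]
accuracy = per_class_accuracy(y_true, y_pred)

# Counts reported in the abstract for the 93 undiagnosed patients.
undiagnosed_pred = ["TBD"] * 16 + ["Other"] * 77
shares = predicted_shares(undiagnosed_pred)

for label, pct in shares.items():
    print(f"{label}: {pct:.1f}% of undiagnosed patients")
```

Reporting accuracy per class rather than overall is useful here because the two labeled groups differ in size (n=20 and n=27), so a single pooled figure would hide how well the rarer class is recovered.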
In the unsupervised analysis, the whole cohort (n=140) was clustered into 4 subgroups (numbered from 1 to 4). An association analysis revealed a statistically significant association (p = 1 × 10⁻⁶) between clusters and molecular diagnoses: in clusters 1 and 2 there was a strong prevalence of TBD patients, whereas in clusters 3 and 4 Other congenital and Undefined diagnoses prevailed.
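The abstract does not state which association test was used. One common choice for a cluster-by-diagnosis contingency table is Pearson's chi-square test, sketched below under that assumption. The cell counts are invented for illustration (only the column totals 20, 27, and 93 match the cohort), and the p-value helper relies on the closed form of the chi-square survival function that holds for an even number of degrees of freedom.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_sf_even_dof(stat, dof):
    """Upper-tail p-value, valid only when dof is a positive even integer:
    P(X > x) = exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive even dof")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical counts: rows are clusters 1-4, columns are
# TBD, Other congenital diseases, Undefined diagnosis.
table = [
    [10, 1, 6],
    [7, 3, 10],
    [2, 12, 40],
    [1, 11, 37],
]
stat, dof = chi_square_statistic(table)
p_value = chi_square_sf_even_dof(stat, dof)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p_value:.2e}")
```

With a 4 x 3 table the test has 6 degrees of freedom, so the even-dof shortcut applies. Several expected counts in a table like this fall below 5, in which case an exact or permutation-based test may be the safer choice.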
In both analyses, Telomere Length (TL) and mucocutaneous abnormalities were the most important features. Supervised analysis revealed these features to be the most relevant drivers in discriminating between patients with TBD and patients with other diagnoses, while unsupervised analysis identified cluster 1 as showing a prevalence of these features.
Interestingly, the two approaches, despite resting on different assumptions, yielded similar results in the “Undefined diagnosis” setting: all 16/93 patients without molecular diagnosis predicted as TBD in the supervised analysis were located in “TBD clusters” 1-2 in the unsupervised analysis, including 5/16 patients with a variant of uncertain significance (VUS) in a TBD gene.
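The agreement between the two analyses amounts to checking that every patient predicted as TBD by the classifier also falls in one of the TBD-enriched clusters. A minimal sketch of that check follows. The patient identifiers, the cluster assignments, and the helper `concordance` are hypothetical; only the designation of clusters 1 and 2 as “TBD clusters” comes from the abstract.

```python
def concordance(predicted_tbd, cluster_of, tbd_clusters=frozenset({1, 2})):
    """Count how many classifier-predicted TBD patients sit in TBD clusters.

    Returns (number in TBD clusters, number predicted as TBD).
    """
    in_tbd_clusters = {
        pid for pid in predicted_tbd if cluster_of[pid] in tbd_clusters
    }
    return len(in_tbd_clusters), len(predicted_tbd)


# Hypothetical identifiers and cluster assignments, for illustration only.
cluster_of = {"P01": 1, "P02": 2, "P03": 3, "P04": 1, "P05": 4}
predicted_tbd = {"P01", "P02", "P04"}

agree, total = concordance(predicted_tbd, cluster_of)
print(f"{agree}/{total} predicted-TBD patients fall in clusters 1-2")
```

Full concordance corresponds to `agree == total`, which is the situation the abstract describes for the 16 undiagnosed patients predicted as TBD.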
These results suggest that this methodology might correctly reallocate a remarkable proportion of undefined or wrongly defined diagnoses, thus potentially improving the diagnostic work-up of rare diseases like TBD that, due to intrinsic difficulties, might be missed or remain undiagnosed.
In addition, this model may be helpful in clinical routine to identify undiagnosed patients in whom genetic testing might reveal uncharacterized mutations and who may deserve careful follow-up.
Disclosures: Beier: RepeatDx: Other: Scientific collaboration; Sobi: Honoraria; Alexion: Honoraria; Pfizer: Honoraria. Brummendorf: Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Patents & Royalties: Combination of Imatinib with hypusination inhibitors, Research Funding; Gilead: Consultancy, Honoraria; Pfizer: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Merck: Honoraria; Roche: Honoraria; Ariad: Consultancy, Honoraria; Repeat Dx: Consultancy, Research Funding. Dufour: Novartis: Consultancy; Sobi: Consultancy; Pfizer: Consultancy, Speakers Bureau; Gilead: Consultancy; Ono: Consultancy; Rockets: Consultancy.