-Asterisk * with author name denotes a Non-ASH member

1327 Application of Machine Learning in the Diagnostic Work-up of Telomere Biology Disorders

Program: Oral and Poster Abstracts
Session: 509. Bone Marrow Failure and Cancer Predisposition Syndromes: Congenital: Poster I
Hematology Disease Topics & Pathways:
Acquired Marrow Failure Syndromes, Research, Artificial intelligence (AI), Inherited Marrow Failure Syndromes, Bone Marrow Failure Syndromes, Clinical Practice (Health Services and Quality), Aplastic Anemia, Clinical Research, Genetic Disorders, Bioinformatics, Diseases, Real-world evidence, Registries, Technology and Procedures, Machine learning
Saturday, December 7, 2024, 5:30 PM-7:30 PM

Erika Massaccesi, MD, PhD1*, Luca Arcuri1*, Giacomo Cavalca2,3*, Fabian Beier, MD4*, Michela Lupia1*, Davide Cangelosi2*, Alice Grossi5*, Marina Lanciotti1*, Filomena Pierri6*, Francesca Fioredda, MD1*, Maurizio Miano1*, Gianluca Dell'Orso1,7*, Maria Carla Giarratana1*, Daniela Guardo1*, Lucia Vankann8*, Francesca Bagnasco9*, Isabella Ceccherini5*, Paolo Uva2*, Tim H. H. Brummendorf, MD4 and Carlo Dufour, MD1

1Hematology Unit, IRCCS Istituto Giannina Gaslini, Genoa, Italy
2Clinical Bioinformatics Unit, IRCCS Istituto Giannina Gaslini, Genoa, Italy
3University of Bologna, Bologna, Italy
4Department of Hematology, Oncology, Hemostaseology, Stem Cell Transplantation, Medical Faculty, RWTH Aachen University, Aachen, Germany
5Laboratory of Genetics and Genomics of Rare Diseases, IRCCS Istituto Giannina Gaslini, Genoa, Italy
6Hematopoietic Stem Cell Transplantation Unit, IRCCS Istituto Giannina Gaslini, Genoa, Italy
7Haematology Unit IRCCS Gaslini Children Hospital, Genoa, Italy
8Department of Hematology, Oncology, Hemostaseology, and Stem Cell Transplantation, Medical Faculty, RWTH Aachen University, Aachen, Germany
9Biostatistics Unit, Scientific Directorate, IRCCS Istituto Giannina Gaslini, Genoa, Italy

Telomere biology disorders (TBD) are heterogeneous diseases whose diagnosis can be very challenging.

To improve the diagnostic work-up, we applied machine learning (ML), conducting both supervised and unsupervised analyses on a cohort of 140 patients (85/140 males and 55/140 females; median age 13.28 years, range 0.25 to 61.29) referred to our Hematology Unit at the Giannina Gaslini Institute from 1989 to April 2023 for persistent cytopenia and/or features suggestive of TBD. For each patient, clinical, biochemical, and genetic features were collected. Patients were labeled as “TBD” (n=20), “Other congenital diseases” (n=27), or “Undefined diagnosis” (n=93), according to their molecular diagnosis.

In the supervised analysis, a Random Forest model was trained on the subset of patients with a confirmed diagnosis (n=47), achieving a prediction accuracy of 75% for “TBD” patients and 96% for “Other congenital diseases” patients. The model was then applied to the subset of undiagnosed patients: 16/93 were predicted as potential TBD and 77/93 as potential Other congenital diseases, corresponding to 17.2% and 82.8% of possibly reallocated diagnoses, respectively.
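The reallocation shares quoted above follow directly from the counts in the undiagnosed subset. A minimal sketch of that arithmetic is given below; the helper name `share` is an assumption introduced here for illustration, and only the counts 16, 77, and 93 come from the abstract.

```python
def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts reported in the abstract for the "Undefined diagnosis" subset.
undiagnosed = 93
predicted_tbd = 16
predicted_other = 77

# The two predicted classes should partition the undiagnosed subset.
assert predicted_tbd + predicted_other == undiagnosed

tbd_pct = share(predicted_tbd, undiagnosed)
other_pct = share(predicted_other, undiagnosed)
print(f"Predicted TBD: {tbd_pct}%  Predicted other congenital: {other_pct}%")
```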

In the unsupervised analysis, the whole cohort (n=140) was clustered into 4 subgroups (numbered from 1 to 4). An association analysis revealed a statistically significant association (p = 1 × 10⁻⁶) between clusters and molecular diagnoses: in clusters 1 and 2 there was a strong prevalence of TBD patients, whereas in clusters 3 and 4 Other congenital and Undefined diagnoses prevailed.
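The abstract does not state which association test was used. One common choice for a cluster-by-diagnosis contingency table is Pearson's chi-square test of independence, and the sketch below computes its statistic and degrees of freedom in plain Python. The function name `chi_square_statistic` is an assumption, and the counts in `table` are invented placeholders for illustration only: they match the reported group sizes (20, 27, 93) but are not the study's data and do not reproduce its p value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# PLACEHOLDER counts, invented for illustration; NOT the study's data.
# Rows: TBD, Other congenital diseases, Undefined diagnosis.
# Columns: clusters 1 to 4.
table = [
    [9, 7, 2, 2],
    [2, 3, 12, 10],
    [8, 8, 40, 37],
]

stat, dof = chi_square_statistic(table)
print(f"chi-square = {stat:.2f} on {dof} degrees of freedom")
```

With small expected counts in some cells, an exact test may be preferable to the chi-square approximation; the sketch only shows how the statistic is built from observed and expected counts.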

In both analyses, telomere length (TL) and mucocutaneous abnormalities were the most important features. In the supervised analysis, they were the most relevant drivers in discriminating between patients with TBD and patients with other diagnoses, while in the unsupervised analysis these features were prevalent in cluster 1.

Interestingly, the two approaches, despite having different assumptions, yielded similar results in the “Undefined diagnosis” setting: all 16/93 patients without a molecular diagnosis who were predicted as TBD in the supervised analysis were located in “TBD clusters” 1-2 in the unsupervised analysis, including 5/16 patients with a variant of uncertain significance (VUS) in a TBD gene.
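The agreement check described above amounts to asking, for each undiagnosed patient predicted as TBD, whether the clustering placed that patient in cluster 1 or 2. A minimal sketch follows; the function name `cluster_concordance`, the patient IDs, and the cluster assignments are all hypothetical toy values, not data from the study.

```python
def cluster_concordance(predicted_tbd, cluster_of, tbd_clusters):
    """Split predicted-TBD patients by whether their cluster is a 'TBD cluster'."""
    concordant = sorted(p for p in predicted_tbd if cluster_of[p] in tbd_clusters)
    discordant = sorted(p for p in predicted_tbd if cluster_of[p] not in tbd_clusters)
    return concordant, discordant


# Hypothetical toy data: patient IDs and their cluster labels (1 to 4).
predicted_tbd = {"U01", "U02", "U03"}
cluster_of = {"U01": 1, "U02": 2, "U03": 1, "U04": 3, "U05": 4}

concordant, discordant = cluster_concordance(predicted_tbd, cluster_of, {1, 2})
print(f"{len(concordant)}/{len(predicted_tbd)} predicted-TBD patients fall in TBD clusters")
```

Full concordance, as reported in the abstract, corresponds to an empty `discordant` list.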

These results suggest that this methodology might correctly reallocate a remarkable proportion of undefined or wrongly defined diagnoses, thus potentially improving the diagnostic work-up of rare diseases like TBD that, due to intrinsic difficulties, might be missed or remain undiagnosed.

In addition, this model may be helpful in clinical routine to identify undiagnosed patients in whom genetic testing might reveal uncharacterized mutations and who may deserve careful follow-up.

Disclosures: Beier: RepeatDx: Other: Scientific collaboration; Sobi: Honoraria; Alexion: Honoraria; Pfizer: Honoraria. Brummendorf: Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Patents & Royalties: Combination of Imatinib with hypusination inhibitors, Research Funding; Gilead: Consultancy, Honoraria; Pfizer: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Merck: Honoraria; Roche: Honoraria; Ariad: Consultancy, Honoraria; Repeat Dx: Consultancy, Research Funding. Dufour: Novartis: Consultancy; Sobi: Consultancy; Pfizer: Consultancy, Speakers Bureau; Gilead: Consultancy; Ono: Consultancy; Rockets: Consultancy.
