-Asterisk * with author name denotes a Non-ASH member

902 Synthetic Histopathological Images Generation with Artificial Intelligence to Accelerate Research and Improve Clinical Outcomes in Hematology

Program: Oral and Poster Abstracts
Type: Oral
Session: 803. Emerging Tools, Techniques and Artificial Intelligence in Hematology: Image-Based Machine Learning in Hematology
Hematology Disease Topics & Pathways:
Research, artificial intelligence (AI), Acute Myeloid Malignancies, AML, MDS, adult, Translational Research, MPN, elderly, Clinical Research, Chronic Myeloid Malignancies, CMML, Diseases, real-world evidence, Myeloid Malignancies, Biological Processes, Technology and Procedures, multi-systemic interactions, Study Population, Human, imaging, machine learning, natural language processing, omics technologies, Pathology
Monday, December 11, 2023: 3:00 PM

Gianluca Asti, MSc1*, Saverio D'Amico, MSc1*, Nico Curti, PhD2,3*, Gianluca Carlini, PhD2,3*, Elisabetta Sauta, PhD1*, Nicolas Riccardo Derus, PhD4*, Daniele Dall'Olio, PhD4*, Claudia Sala, PhD4*, Lorenzo Dall'Olio, PhD2,3*, Luca Lanino, MD5, Giulia Maggioni, MD1*, Alessia Campagna1*, Marta Ubezio, MD1*, Antonio Russo, MD1*, Gabriele Todisco, MD1*, Cristina Astrid Tentori, MD1*, Pierandrea Morandini, MEng1*, Marilena Bicchieri, PhD1*, Maria Chiara Grondelli, BSc1*, Matteo Zampini, PhD1*, Victor Savevski, MEng1*, Armando Santoro, MD6,7*, Shahram Kordasti, MD, PhD8, Valeria Santini, MD9, Anne Sophie Kubasch, MD10*, Uwe Platzbecker, MD11, Maria Diez-Campelo, MD, PhD12*, Pierre Fenaux, MD, PhD13, Lin-Pierre Zhao, MD13*, Amer M. Zeidan, MBBS, MHS14, Torsten Haferlach, MD, PhD15, Gastone Castellani, PhD4* and Matteo Giovanni Della Porta, MD1,16*

1Humanitas Clinical and Research Center, IRCCS, Rozzano, Italy
2Department of Physics and Astronomy, University of Bologna, Bologna, Italy
3Data Science and Bioinformatics Laboratory, IRCCS Institute of Neurological Sciences of Bologna, Bologna, Italy
4University of Bologna, Bologna, Italy
5Humanitas Clinical and Research Center, IRCCS, Rozzano, Milano, Italy
6Humanitas Cancer Center, Rozzano, Mi, Italy
7Humanitas University, Pieve Emanuele and IRCCS Humanitas Research Hospital- Humanitas Cancer Center Rozzano, Milan, Italy
8King's College London, London, United Kingdom
9MDS Unit, DMSC, AOU Careggi, University of Florence, Firenze, Italy
10Department of Hematology, Cellular Therapy, Hemostaseology and Infectious Diseases, University Medical Center Leipzig, Leipzig, Germany
11Medical Clinic and Policlinic 1, Hematology and Cellular Therapy, University Hospital Leipzig, Leipzig, Germany
12Department of Hematology, Salamanca-IBSAL University Hospital, Salamanca, Spain
13Saint Louis hospital APHP, Paris, France
14Section of Hematology, Department of Internal Medicine, Yale School of Medicine and Yale Cancer Center, New Haven, CT
15MLL Munich Leukemia Laboratory, Munich, Germany
16Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Italy

Background: Hematological malignancies are rare and complex diseases; as a consequence, multimodal data (ranging from clinical and genomic information to images) are required to improve diagnosis, prognosis and personalized treatment. However, collecting all these layers of information is challenging, particularly for cytological and histological images of the bone marrow (BM) that reproduce the morphologic features of the disease. Synthetic data generation by Artificial Intelligence (AI) can circumvent these issues by generating images conditioned on textual inputs (i.e., reports from pathologists), which are widely available and contain much useful clinical information. This technology can enrich datasets with synthetic images, thus boosting translational research and improving the performance of precision medicine strategies based on multimodal information.

Aims: This project was conducted by the GenoMed4all and Synthema EU consortia, with the aims to: 1) Apply generative models to a real-world dataset of histological images from patients with myeloid neoplasms (MN). 2) Develop a Synthetic Images Validation Framework (SIVF) to evaluate the utility and fidelity of generated images. 3) Verify the capability of synthetic images to accelerate research and to improve clinical models.

Methods: We implemented a Stable Diffusion (SD) generative model, fine-tuned on hematological data, to generate Hematoxylin and Eosin (H&E) images of MN patients. We implemented a domain-specific language model (HematoBERT) to encode the textual input used as the condition for the generation process. Use cases were patients with Myelodysplastic Syndrome (MDS), Acute Myeloid Leukemia (AML) and Myeloproliferative Neoplasm (MPN) who had available BM biopsies with the corresponding pathologists' reports, as well as genomic and clinical data. We applied SIVF to compare the distributions of morphological features extracted from real and synthetic images.

Clinical validation was performed on disease classification and survival probability prediction, using features from real and synthetic images (the experimental setting is shown in Figure 1).

Results: We trained the SD model on 200 patients with available BM biopsies and associated reports. We first applied SIVF to compare morphological features (geometrical, color and texture features of cell nuclei) extracted from synthetic and real images of 55 patients never seen by the model. The results showed that feature distributions and correlations were comparable in the two datasets. Similar results were obtained when SIVF was applied to each individual patient's data.
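
The abstract does not state which statistical test SIVF uses to compare feature distributions. As a minimal sketch, assuming a two-sample Kolmogorov-Smirnov statistic is an acceptable measure of similarity, the comparison of one morphological feature between real and synthetic images could look as follows. The function name and the feature values are hypothetical and serve only as illustration.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    a, b = sorted(real), sorted(synthetic)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # The empirical CDFs only change at observed values, so the maximum
    # gap is attained at one of them.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical nuclear-area values (arbitrary units), NOT data from the study.
real_area = [41.0, 43.5, 39.8, 45.2, 42.1, 40.7]
synthetic_area = [40.2, 44.1, 41.5, 43.0, 39.5, 46.0]
print(ks_statistic(real_area, synthetic_area))
```

A small statistic suggests that the synthetic feature distribution is close to the real one; in practice the check would be repeated for every extracted feature and complemented by a comparison of the feature correlation matrices.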

We then tested whether synthetic data augmentation could improve performance on MN classification (i.e., the ability of a model to correctly assign an individual patient to a specific clinical entity according to the 2022 WHO classification criteria). We implemented three XGBoost models to classify each patient's disease. The classifiers were trained and validated on morphological features extracted from the images of a real set of patients (n=614), a synthetic group (n=396) and a mixed dataset (n=1010). Data augmentation improved classification performance by 10% (F1 score) when tested on the three different validation sets.
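
For readers unfamiliar with the metric, the F1 score for a multi-class problem can be sketched as below. The abstract does not say which averaging scheme was used; macro averaging (the unweighted mean of per-class F1) is assumed here, and the labels and predictions are hypothetical, not study data.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical diagnoses and classifier outputs, for illustration only.
true_dx = ["MDS", "MDS", "AML", "MPN"]
pred_dx = ["MDS", "AML", "AML", "MPN"]
print(round(macro_f1(true_dx, pred_dx), 3))
```

Macro averaging weights each disease class equally regardless of its size, which is one reasonable choice when the classes are imbalanced.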

Finally, demographics, clinical features and genomics (cytogenetics and gene mutations) were included as covariates, together with morphological features extracted from BM biopsies, in L1-penalized Cox proportional hazards models with Overall Survival as the primary endpoint. Models were fitted on two different cohorts of real patients (n=182, n=294). We then added 112 synthetic patients to each cohort and refitted the models. We observed an improvement in performance of >10% (C-index) in both cases (Figure 2), with morphological features (such as the “major axis” of nuclei) selected among the best predictors.
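
The C-index reported above measures how well predicted risk orders patients by survival time. The following is a minimal sketch of Harrell's concordance index, assuming right-censored data and ignoring pairs with tied survival times; it is not the authors' implementation, and the function name and example values are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score.
    Ties in risk score count as half-concordant."""
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("inputs must have equal length")
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:  # patient i died before j was last observed
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (months), event flags (1 = death) and model risk scores.
months = [2, 4, 6, 8]
died = [1, 1, 0, 1]
risk = [0.9, 0.1, 0.4, 0.7]
print(concordance_index(months, died, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so comparing the C-index of the models fitted with and without synthetic patients quantifies the gain from augmentation.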

All these results confirmed that data augmentation with synthetic data is a viable approach and can significantly improve the models' capability to efficiently capture clinical outcomes at the individual patient level.

Conclusion: AI-generated images preserve the properties of real-world images, replicating the morphological features of cells that are relevant for identifying hematological diseases and their clinical status. This approach, based on widely available textual data, allows effective data augmentation and effortless data sharing, thus accelerating and improving precision medicine research in hematology.

Disclosures: Santoro: Novartis: Speakers Bureau; Eisai: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Arqule: Other; Bayer: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Pfizer: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Sandoz: Speakers Bureau; Eli Lilly: Speakers Bureau; AstraZeneca: Speakers Bureau; Celgene (BMS): Speakers Bureau; Amgen: Speakers Bureau; Abbvie: Speakers Bureau; Roche: Speakers Bureau; Takeda: Speakers Bureau; Merck MSD: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Gilead: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Servier: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; BMS: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Incyte: Consultancy; Sanofi: Consultancy. Kordasti: Beckman Coulter: Honoraria; Novartis: Honoraria, Membership on an entity's Board of Directors or advisory committees; MorphoSys: Research Funding. Santini: BMS, Abbvie, Geron, Gilead, CTI, Otsuka, Servier, Janssen, Syros: Membership on an entity's Board of Directors or advisory committees.
Platzbecker: Silence Therapeutics: Consultancy, Honoraria, Research Funding; Takeda: Consultancy, Honoraria, Research Funding; Bristol Myers Squibb: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Other: travel support; medical writing support, Research Funding; Servier: Consultancy, Honoraria, Research Funding; Janssen Biotech: Consultancy, Research Funding; Syros: Consultancy, Honoraria, Research Funding; Merck: Research Funding; Curis: Consultancy, Research Funding; Geron: Consultancy, Research Funding; Roche: Research Funding; BeiGene: Research Funding; BMS: Research Funding; MDS Foundation: Membership on an entity's Board of Directors or advisory committees; AbbVie: Consultancy; Novartis: Consultancy, Honoraria, Research Funding; Celgene: Honoraria; Jazz: Consultancy, Honoraria, Research Funding; Fibrogen: Research Funding; Amgen: Consultancy, Research Funding. Diez-Campelo: Gilead Sciences: Other: Travel expense reimbursement; GSK: Consultancy, Membership on an entity's Board of Directors or advisory committees; Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees; BMS/Celgene: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Other: Advisory board fees. Fenaux: AbbVie: Consultancy, Honoraria, Research Funding; Jazz: Consultancy, Honoraria, Research Funding; Janssen: Consultancy, Honoraria, Research Funding; French MDS Group: Honoraria; Novartis: Consultancy, Honoraria, Research Funding; Bristol Myers Squibb: Consultancy, Honoraria, Research Funding. 
Zeidan: Novartis: Consultancy, Honoraria; Boehringer-Ingelheim: Consultancy, Honoraria; Incyte: Consultancy, Honoraria; Agios: Consultancy, Honoraria; Servier: Consultancy, Honoraria; Seattle Genetics: Consultancy, Honoraria; Amgen: Consultancy, Honoraria; Janssen: Consultancy, Honoraria; Genentech: Consultancy, Honoraria; Zentalis: Consultancy, Honoraria; Astex: Research Funding; Shattuck Labs: Research Funding; Syros: Consultancy, Honoraria; Lox Oncology: Consultancy, Honoraria; ALX Oncology: Consultancy, Honoraria; Orum: Consultancy, Honoraria; Notable: Consultancy, Honoraria; BioCryst: Consultancy, Honoraria; Takeda: Consultancy, Honoraria; Ionis: Consultancy, Honoraria; BeyondSpring: Consultancy, Honoraria; Otsuka: Consultancy, Honoraria; Epizyme: Consultancy, Honoraria; Syndax: Consultancy, Honoraria; Gilead: Consultancy, Honoraria; Kura: Consultancy, Honoraria; Chiesi: Consultancy, Honoraria; Mendus: Consultancy, Honoraria; Tyme: Consultancy, Honoraria; Schrödinger: Consultancy, Honoraria; Regeneron: Consultancy, Honoraria; Foran: Consultancy, Research Funding; Taiho: Consultancy, Honoraria; Geron: Consultancy, Honoraria; Astellas: Consultancy, Honoraria; Daiichi Sankyo: Consultancy, Honoraria; Jazz: Consultancy, Honoraria; Celgene/BMS: Consultancy, Honoraria; Pfizer: Consultancy, Honoraria; AbbVie: Consultancy, Honoraria. Haferlach: MLL Munich Leukemia Laboratory: Current Employment, Other: Equity Ownership. Della Porta: Bristol Myers Squibb: Honoraria, Membership on an entity's Board of Directors or advisory committees.
