-Asterisk * with author name denotes a Non-ASH member

1715 Machine Learning-Based Survival Analysis Reveals Prognostic Clinical and Genetic Insights for Patients with Cutaneous T-Cell Lymphoma

Program: Oral and Poster Abstracts
Session: 624. Hodgkin Lymphomas and T/NK Cell Lymphomas: Clinical and Epidemiological: Poster I
Hematology Disease Topics & Pathways:
Research, artificial intelligence (AI), epidemiology, Lymphomas, non-Hodgkin lymphoma, Clinical Research, genomics, bioinformatics, T Cell lymphoma, Diseases, Lymphoid Malignancies, survivorship, Biological Processes, Technology and Procedures, machine learning
Saturday, December 9, 2023, 5:30 PM-7:30 PM

Celine M. Schreidah1*, David M. DeStephano, MPH2*, Samuel S. Pan, MSc2*, Shikun Wang, PhD2*, Haoyang Shen, MS3*, Casey N. Ta, PhD4*, George Bingham Reynolds, MS3*, Lauren M. Fahmy, BS1, Emily R. Gordon, BA1*, Oluwaseyi Adeuyan, BS1*, Bradley D. Kwinta, MD1*, Connor J. Stonesifer, MD5*, Warren H. Chan, MD, MS6*, Jaehyuk Choi, MD, PhD7, Madeleine Duvic, MD8*, Fernando Gallardo, MD, PhD9*, Michael Girardi, MD10*, Joan Guitart, MD7*, Youn H. Kim, MD11, Michael S. Khodadoust, MD11*, Safa Najidh, MD12*, Xiao Ni, MD, PhD8*, Ramon M. Pujol, MD, PhD9*, Cornelis P. Tensen, PhD12*, Maarten H. Vermeer, MD, PhD12*, Sean Whittaker, MD13*, Nicholas P. Tatonetti, PhD3,4,14*, Herbert S. Chase, MD, MA4*, Itsik Pe'er, PhD3,15,16* and Larisa J. Geskin, MD2,17*

1Columbia University Vagelos College of Physicians and Surgeons, New York, NY
2Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY
3The Data Science Institute, Columbia University, New York, NY
4Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY
5Dr. Phillip Frost Department of Dermatology and Cutaneous Surgery, University of Miami Miller School of Medicine, Miami, FL
6Department of Dermatology, Icahn School of Medicine at Mount Sinai, New York, NY
7Department of Dermatology, Northwestern University Feinberg School of Medicine, Chicago, IL
8MD Anderson Cancer Center, The University of Texas, Houston, TX
9Department of Dermatology, Hospital del Mar-Parc de Salut Mar, Barcelona, Spain
10Department of Dermatology, Yale School of Medicine, New Haven, CT
11Departments of Dermatology and Medicine - Oncology, Stanford University School of Medicine, Stanford, CA
12Department of Dermatology, Leiden University Medical Centre, Leiden, Netherlands
13St John's Institute of Dermatology, Guy’s and St Thomas’ NHS Foundation Trust, London, United Kingdom
14Department of Computational Biomedicine and Cedars-Sinai Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA
15Department of Computer Science, Columbia University, New York, NY
16Department of Systems Biology, Columbia University, New York, NY
17Department of Dermatology, Columbia University Irving Medical Center, New York, NY

Introduction: Cutaneous T-cell lymphomas (CTCL) are heterogeneous lymphoproliferative disorders that span a spectrum of disease presentation and severity. Around two-thirds of CTCL cases can be classified as mycosis fungoides (MF) or Sézary syndrome (SS). While advanced stages of MF and SS are associated with decreased survival and worse outcomes, even early-stage patients can have a variable course. Numerous deep sequencing studies have fallen short of identifying genetic abnormalities that drive disease pathogenesis and predict prognosis. Large-cell transformation and elevated lactate dehydrogenase levels are associated with worse prognosis in SS; however, such features cannot accurately prognosticate patient survival. There is a need for investigation that may assist in the prognostication of survival in patients with CTCL. Machine learning methods may help to elucidate correlations between clinical and genetic factors to predict disease progression and outcomes.

Objective: Using integrated clinical and genomic data from six international sequencing studies, this investigation aimed to perform survival analysis with artificial intelligence/machine learning methods to identify clinical and genetic features associated with survival outcomes.

Methods: A total of 126 eligible patients were identified, of whom 99 had sufficient clinical data and 88 had sufficient clinical and genetic data. A Poisson distribution was used to assess genomic data, and significant genetic abnormalities for each individual patient were linked with that patient's clinical outcomes. Genetic inputs included mutational data occurring at least four times in the dataset. Multiple imputation using a random forest model was applied to all included variables (rate of missingness <10%). Overall survival was assessed using three separate Cox models fit with patient clinical, laboratory, or treatment covariates.
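The abstract does not specify how the Poisson distribution was applied to the genomic data. One common use, shown here purely as an assumption, is to ask whether a gene is mutated in more patients than a background rate would predict. The sketch below computes the Poisson upper-tail probability for the four-occurrence threshold mentioned above; the function name and the background rate are hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if k <= 0:
        return 1.0
    # P(X >= k) = 1 - sum_{i < k} exp(-lam) * lam**i / i!
    cdf_below_k = sum(
        math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k)
    )
    return max(0.0, 1.0 - cdf_below_k)


# Hypothetical background expectation of 0.5 mutated patients per gene.
# Probability of seeing a gene mutated in four or more patients by chance:
p_recurrent = poisson_upper_tail(4, 0.5)
```

A small tail probability would suggest that the gene recurs more often than the assumed background rate explains.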

Ten-fold cross-validated Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied using adaptive regularization and, separately, using an iterated approach with 100 repeats of ten-fold cross-validation (resampling validation) to select the lambda value with the highest average C-index. The one-standard-error rule was used for the adaptive LASSO, while the iterated LASSO penalty was relaxed by 0.1. Prior to adjusting the lambdas, the best-performing C-index was 0.73 and 0.64 for the adaptive LASSO and iterated LASSO, respectively. Following adjustment, the C-indexes were 0.71 and 0.60, respectively. To elucidate genetic candidates, we performed genome-wide association studies (GWAS), which used the false discovery rate (FDR) correction for multiple comparisons and adjusted for the first three principal components from a principal component analysis (PCA) that included the imputed non-mutational covariates.
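To make the model-selection criteria concrete, the following minimal sketch shows how a C-index and a one-standard-error choice of lambda can be computed. It is not the authors' code: it assumes Harrell's definition of the C-index (ignoring tied survival times) and the usual form of the one-standard-error rule (the largest penalty whose mean score is within one standard error of the best). All function names and example values are hypothetical.

```python
import math
import statistics


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. Higher risk should mean earlier event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def one_standard_error_lambda(cv_scores):
    """Pick the largest lambda within one standard error of the best mean.

    cv_scores maps each candidate lambda to its per-fold C-index values.
    """
    means = {lam: statistics.mean(s) for lam, s in cv_scores.items()}
    best_lam = max(means, key=means.get)
    best_scores = cv_scores[best_lam]
    std_err = statistics.stdev(best_scores) / math.sqrt(len(best_scores))
    threshold = means[best_lam] - std_err
    return max(lam for lam, m in means.items() if m >= threshold)


# Hypothetical toy data: follow-up times, event indicators, predicted risks.
c_index = concordance_index([5, 3, 9, 7], [1, 1, 0, 1], [0.6, 0.9, 0.1, 0.3])

# Hypothetical per-fold C-index values for three candidate penalties.
chosen = one_standard_error_lambda({
    0.01: [0.74, 0.72, 0.73],
    0.05: [0.73, 0.72, 0.73],
    0.20: [0.60, 0.62, 0.61],
})
```

Choosing a stronger penalty than the best-scoring one trades a small loss in C-index for a sparser model, which is consistent with the lower C-indexes reported after the lambdas were adjusted.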

Results: We used standard statistical and machine learning methodologies to elucidate prognostic factors in MF and SS. Using standard Cox regression analysis, and in agreement with prior investigations, we found significant associations for age at diagnosis (hazard ratio = 1.06, P<0.001), stage at sampling (hazard ratio = 1.99, P=0.007), and lymph node involvement at diagnosis (hazard ratio = 4.59, P<0.001).
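A hazard ratio for a continuous covariate applies per unit of that covariate and compounds multiplicatively under the Cox model. The sketch below rescales the reported age hazard ratio to a ten-unit difference. It assumes that age was modeled linearly per year, which the abstract does not state, and the function name is hypothetical.

```python
import math


def scale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a difference of `units`, given the per-unit ratio.

    Under a Cox model, log-hazard is linear in the covariate, so
    HR(units) = exp(units * ln(HR per unit)).
    """
    if hr_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return math.exp(units * math.log(hr_per_unit))


# Reported hazard ratio for age at diagnosis, assumed to be per year.
hr_per_decade = scale_hazard_ratio(1.06, 10)  # about 1.79
```

Under that assumption, a patient diagnosed ten years older would carry roughly 1.8 times the hazard, all else being equal.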

For the first time, using PCA-adjusted GWAS and iterated and adaptive LASSO, we demonstrate the association of mutated genes with survival in patients with MF and SS (e.g., most significant GWAS hazard ratio of 0.17, P=0.007, FDR=0.35). Moreover, several mutated genes were associated with particularly poor outcomes and high mortality (e.g., most significant GWAS hazard ratio of 5.26, P=0.003, FDR=0.63). Mutated genes with the highest- or lowest-magnitude LASSO effect estimates, or the lowest P values in GWAS, were highly associated with survival outcomes. When present, the mutated genes carried significant prognostic implications for these patients.
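The abstract reports FDR values alongside raw P values but does not name the correction procedure. Assuming the widely used Benjamini-Hochberg step-up method, adjusted values can be computed as in the sketch below; the function name and the example P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four tested genes.
fdr_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Because each adjusted value scales the raw P value by the number of tests divided by its rank, a small raw P value can still correspond to a large FDR when many genes are tested.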

Conclusions: Taken together, our results demonstrate the potential of machine learning and artificial intelligence methodologies for investigating novel genetic associations with survival and for prognostication. Prospective studies are needed to validate our findings.

Disclosures: Choi: Moonlight Bio: Current equity holder in private company, Membership on an entity's Board of Directors or advisory committees, Other: Co-founder, Patents & Royalties. Kim: Eisai: Research Funding; Citius: Research Funding; Kyowa Kirin: Research Funding; Innate: Research Funding; Corvus: Research Funding; Trillium: Research Funding; Elorac: Research Funding; CRISPR Therapeutics: Research Funding; Takeda: Research Funding; Drenbio: Research Funding. Khodadoust: CRISPR Therapeutics: Research Funding; Daiichi Sankyo: Membership on an entity's Board of Directors or advisory committees; Nutcracker Therapeutics: Research Funding.
