Type: Oral
Session: 503. Clonal Hematopoiesis, Aging, and Inflammation: Causes and Consequences
Hematology Disease Topics & Pathways:
Research, Fundamental Science, Translational Research, CHIP, Genomics, Lymphoid Malignancies, Myeloid Malignancies, Biological Processes, Study Population, Human
Clonal hematopoiesis (CH), the expansion of somatic mutation-bearing hematopoietic stem cell (HSC) clones, is associated with an increased risk of hematological neoplasms and non-malignant diseases. Most of what we know about CH has come from studies of clones driven by putative driver mutations (CH-PD), but less is known about another common form of CH, CH with unknown drivers (CH-UD). Here, we develop a machine learning-based approach to detect CH-UD from whole-genome sequencing data and use this to decipher the risk factors, genetic determinants and phenotypic consequences of CH-UD, by studying the largest whole-genome sequenced cohort to date.
Methods:
We studied 409,553 European participants from the UK Biobank with both whole-exome and whole-genome sequencing (WES and WGS) data available, and with no prior diagnosis of hematological malignancies. Using variant calls from blood WGS, we trained a machine learning classifier on 78,794 individuals with known CH (CH-PD or mosaic chromosomal alterations, mCA) and 36,854 young individuals (<45 years) with no evidence of CH. The classifier examined 120 genetic features across six categories, namely passenger mutation burden, variant type (SNVs, indels), signature (single- and tri-nucleotide context), position, and tolerance. A genome-wide association study (GWAS) of CH-UD was performed using imputed SNP array data, and a gene-level association analysis was performed using WES data.
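To make the feature categories more concrete, the following is a minimal sketch of how a few per-individual summary features (burden, SNV fraction, substitution-class fractions) could be derived from a list of variant calls. It is not the authors' pipeline: the function names, the input format (a list of REF/ALT pairs), and the choice of features are all assumptions made for illustration, and the actual classifier used 120 features.

```python
from collections import Counter

# Base complements, used to fold substitutions onto the pyrimidine strand
# so that, for example, G>A is counted together with C>T.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return a pyrimidine-normalised class such as 'C>T'.

    Returns None for anything that is not a single-base substitution
    (for example an insertion or deletion).
    """
    if len(ref) != 1 or len(alt) != 1 or ref == alt:
        return None
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def summarize_variants(variants):
    """Summarise hypothetical (ref, alt) calls for one individual."""
    classes = [substitution_class(ref, alt) for ref, alt in variants]
    snv_classes = [c for c in classes if c is not None]
    counts = Counter(snv_classes)
    n_total = len(variants)
    n_snv = len(snv_classes)
    return {
        "burden": n_total,
        "snv_fraction": n_snv / n_total if n_total else 0.0,
        "c_to_t_fraction": counts["C>T"] / n_snv if n_snv else 0.0,
        "c_to_g_fraction": counts["C>G"] / n_snv if n_snv else 0.0,
    }


# Toy input, invented for illustration only.
example = [("C", "T"), ("G", "A"), ("C", "G"), ("A", "AT"), ("T", "C")]
features = summarize_variants(example)
```

Feature vectors of this kind, computed per individual, could then be passed to any standard supervised classifier; the abstract does not state which model family was used.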
Results:
To date, CH-UD has been identified by the presence of a sufficient number of putative somatic passenger mutations. Beyond the presence of such mutations, our classifier revealed additional features predictive of CH-UD: enrichment for SNVs over indels, enrichment for C>T transitions and depletion of C>G transversions, enrichment for mutations at HSC open chromatin regions, and enrichment for mutations at mutation-tolerant sites. Our classifier outperformed an approach based solely on the number of passenger mutations, particularly for identifying cases of CH-UD with large (≥40%) clonal fractions (area under the curve: 89% vs. 81%; sensitivity: 80% vs. 63%; specificity: 98% for both).
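For readers less familiar with the performance metrics quoted above, the sketch below shows how sensitivity, specificity, and area under the ROC curve are defined. The counts and scores are invented for illustration (the counts were chosen only so that they give 80% and 98%) and are not the study's confusion matrix; the function names are likewise hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive scores higher than a randomly chosen negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts, not taken from the study.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=98, fp=2)

# Hypothetical classifier scores for three cases and four controls.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1])
```

Because specificity is the same for both approaches in the abstract, the comparison effectively reduces to how many true CH-UD cases each approach recovers at a matched false-positive rate.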
We next applied our classifier to detect CH-UD in the remaining 330,759 individuals without CH-PD or mCA, and identified 22,469 with CH-UD (5.49% of the full cohort of 409,553). CH-UD was strongly associated with age (odds ratio (OR) [95% CI] = 1.084 [1.081,1.086], P < 2.2×10⁻¹⁶), reaching a frequency of 8.44% in those aged 70. Additionally, a dose-dependent association with smoking was observed (previous smoker: OR = 1.39 [1.35,1.43], P = 4.67×10⁻¹⁰⁴; current smoker: OR = 2.19 [2.10,2.29], P = 1.21×10⁻²⁷⁴). An association with daily alcohol intake was also observed (OR = 1.15 [1.06,1.25], P = 4.32×10⁻⁴), and longer genetically-predicted telomere length was associated with increased risk of CH-UD (OR = 1.044 [1.029,1.058], P = 1.30×10⁻⁹).
A GWAS identified 52 loci associated with CH-UD, of which 24 were novel and not previously linked to CH-UD or CH-PD. Novel loci included genes recurrently mutated in myeloid neoplasms, namely MECOM (P = 9.80×10⁻¹⁷), PTPN11 (P = 3.67×10⁻⁸), CEBPA (P = 6.38×10⁻¹¹), and NFE2 (P = 1.31×10⁻¹³). Gene-burden analysis, which aggregates rare germline variants at the gene level, identified CHEK2 variants to be associated with risk of CH-UD (OR = 1.58 [1.37,1.83], P = 2.74×10⁻⁹).
Lastly, we investigated CH-UD disease associations and identified increased risk of both myeloid neoplasms (hazard ratio (HR) = 3.90 [3.37,4.53], P = 3.06×10⁻⁷³) and lymphoid neoplasms (HR = 1.48 [1.33,1.64], P = 2.67×10⁻¹³). In comparison, CH-PD was associated with a higher risk of myeloid neoplasms (HR = 7.04 [6.16,8.06], P = 2.41×10⁻¹⁷⁸) but a lower risk of lymphoid neoplasms (HR = 1.27 [1.12,1.45], P = 2.00×10⁻⁴) relative to CH-UD.
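As a rough sanity check on effect sizes reported with 95% confidence intervals, an approximate Wald z-score and two-sided P value can be recovered from the point estimate and interval alone. The sketch below assumes the interval is symmetric on the log scale and uses a normal approximation; the function name is hypothetical, and because the published estimates are rounded, the result should agree with the reported P value only in order of magnitude.

```python
import math


def wald_from_ci(estimate, lower, upper):
    """Approximate z-score and two-sided P value for a ratio estimate
    (OR or HR) given its 95% confidence interval.

    Assumes the CI was built as exp(log(estimate) +/- 1.96 * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.959964)
    z = math.log(estimate) / se
    # Two-sided normal tail probability; erfc stays accurate for large |z|.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# CH-PD and myeloid neoplasms, using the HR and CI quoted in the text.
z_chpd, p_chpd = wald_from_ci(7.04, 6.16, 8.06)

# CH-UD and myeloid neoplasms, using the HR and CI quoted in the text.
z_chud, p_chud = wald_from_ci(3.90, 3.37, 4.53)
```

For the CH-PD myeloid association this gives a z-score of roughly 28, which corresponds to an extremely small P value; for the CH-UD myeloid association the z-score is roughly 18.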
Conclusions:
We develop a new classifier that improves detection of CH-UD and apply it to the largest whole-genome sequenced cohort to date. Using our findings, we search for genome-wide genetic associations of CH-UD. We find that some of these are shared with CH-PD, but most (including 24 novel loci identified here) are specifically associated with CH-UD. Our findings suggest that CH-UD has some similarities with CH-PD but also many differences in its pathogenesis, and that it warrants specific investigation to better understand its own causes, consequences and relationship to ageing and CH-related diseases.
Disclosures: Wen: AstraZeneca: Current Employment. Karpinski: AstraZeneca: Current Employment. Vitsios: AstraZeneca: Current Employment. Wasilewski: AstraZeneca: Current Employment, Current equity holder in publicly-traded company. Petrovski: AstraZeneca: Current Employment. Harper: AstraZeneca: Current Employment. Fabre: AstraZeneca: Current Employment. Vassiliou: STRM.BIO: Consultancy; AstraZeneca: Research Funding. Mitchell: AstraZeneca: Current Employment.