Type: Oral
Session: 803. Emerging Tools, Techniques and Artificial Intelligence in Hematology: Emerging Technologies for Understanding Benign and Malignant Hematology
Hematology Disease Topics & Pathways:
Research, Translational Research
To focus on specific HLA/TCR interactions, we analyzed HLA genotypes for the presence of common haplotypes. First, we generated HLA haplotype calls from whole-exome sequencing data. HLA-A*02 was the most common class I allele but was not overrepresented in LGL vs. controls. We also identified HLA-B*44 as a risk allele, present in 36% of LGL cases with STAT3 mutation; enrichment was greater still when HLA-A*02 and HLA-B*44 were considered together (Fig. 1A).
After TCR NGS, we selected a threshold of 5.0% (based on a study of 900 controls) to define expanded (co)dominant clonotypes: 145 expanded clones were identified, of which 143 were unique/private. To identify a possible consensus sequence across the CDR3 profile, we clustered the CDR3 AA sequences by global alignment score (Needleman-Wunsch) using the BLOSUM62 substitution matrix and identified 9 clusters. However, multiple-sequence alignment within each CDR3 cluster did not show any distinct consensus sequence pattern, either overall or within cases sharing HLA-A*02. Cross-referencing with CDR3 databases of known conditions yielded 87 CDR3β (co)dominant LGL clonotypes. About 50% of those LGL clonotypes were shared with 2 common diseases: type I diabetes and celiac disease. We also mined the VDJdb[2] for known CDR3-pHLA associations. After filtering for high-quality annotations, matches, and HLA-A*02 restriction, LGL-derived TCRs were found to recognize multiple epitopes. Prominent examples include CMV- and EBV-derived peptide antigens from pp65, BRLF1, and BMLF1 (NLVPMVATV, RPPIFIRRL), present in 44% of expanded LGL clonotypes (Fig. 1B).
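The alignment-based clustering step described above can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes a linear gap penalty, a toy match/mismatch score standing in for BLOSUM62 (in practice the matrix could be loaded with, e.g., Biopython's `substitution_matrices.load("BLOSUM62")`), single-linkage grouping with an arbitrary normalized-score cutoff, and hypothetical CDR3-like sequences. The abstract does not state the gap penalties, the clustering algorithm, or the cutoff.

```python
from itertools import combinations

def toy_score(a, b):
    """Stand-in substitution score (assumption): +2 match, -1 mismatch.
    A real analysis would look up the BLOSUM62 value for (a, b) here."""
    return 2 if a == b else -1

def nw_score(s, t, score=toy_score, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(t) + 1)]  # row for the empty prefix of s
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            curr.append(max(
                prev[j - 1] + score(s[i - 1], t[j - 1]),  # align two residues
                prev[j] + gap,                            # gap in t
                curr[j - 1] + gap,                        # gap in s
            ))
        prev = curr
    return prev[-1]

def cluster(seqs, min_norm_score=0.6):
    """Single-linkage clustering (assumption): two sequences are linked when
    their alignment score, divided by the smaller of their self-alignment
    scores, reaches min_norm_score."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    self_scores = [nw_score(s, s) for s in seqs]
    for i, j in combinations(range(len(seqs)), 2):
        norm = nw_score(seqs[i], seqs[j]) / min(self_scores[i], self_scores[j])
        if norm >= min_norm_score:
            parent[find(i)] = find(j)

    groups = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return sorted(groups.values(), key=len, reverse=True)

# Hypothetical CDR3-like sequences, for illustration only (not study data).
demo = ["CASSLGQGAYEQYF", "CASSLGQGSYEQYF", "CASSPDRGNTEAFF", "CASSPDRGDTEAFF"]
clusters = cluster(demo)
```

With these toy inputs, the two pairs that differ by a single residue fall into separate clusters. Swapping `toy_score` for a BLOSUM62 lookup changes the scores but not the structure of the computation.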
To reverse engineer the identity of antigenic peptides from the CDR3β sequence and HLA type (high-affinity peptides restricted to HLA-A2), we developed a machine-learning method as a predictive modeling tool that quantifies CDR3β/peptide binding. A previously developed LLM trained on AA sequences was used for this purpose[3]: LLM embeddings serve as input to a deep CNN-based model trained on IEDB positive/negative datasets[4]. The training data were augmented with control CDR3 sequences as negative data points. The resulting model achieved an AUC of 0.94 in cross-validation. As a proof of concept for epitope discovery, CMV pp65-derived peptides were fitted to specific LGL CDR3 sequences, and CMV-specific CDR3 clonotypes were found to match, e.g., the pp65 epitope NLVPMVATV, among several others overlapping with the findings described above.
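For readers less familiar with the AUC metric quoted above, the following minimal sketch shows how it can be computed from binding scores and binder/non-binder labels. It is not the evaluation code behind the reported 0.94. The function name, labels, and scores are hypothetical toy values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive pair receives a higher score than a randomly chosen
    negative pair (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = known CDR3/peptide binder, 0 = control (made-up values).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 positive/negative pairs ranked correctly
```

In cross-validation, a value like this would be computed on each held-out fold and then summarized across folds. The abstract does not specify the fold scheme.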
In conclusion, our analysis shows that a significant number of expanded clonotypes may be associated with known epitopes, e.g., in autoimmune disease and viral infection, pointing toward molecular mimicry. However, the specificity of a large fraction of the LGL spectrum remains unknown. For the discovery of the corresponding peptides, we developed an ML-based antigen simulation method that identifies the peptides best fitting a given HLA allele and CDR3β clonotype.
Fig. 1. Global map of co-occurrence of HLA allotypes showing the delta frequency compared to healthy controls, and Sankey plot of identified CDR3β-epitope associations. (A) Using the whole-exome sequencing-based HLA genotype calls, we calculated the frequency of co-occurrence of different HLA loci and investigated the differences relative to controls. (B) Using the VDJdb, we conducted a sequence similarity-based search and extracted the CDR3-pHLA annotations.
Disclosures: Maciejewski: Alexion: Membership on an entity's Board of Directors or advisory committees; Omeros: Consultancy; Novartis: Honoraria, Speakers Bureau; Regeneron: Consultancy, Honoraria.