Artificial Intelligence-Enhancement of Flow Cytometry Data Accelerates the Identification of Minimal Residual Chronic Lymphocytic Leukemia

Chiu, April

Background: Flow cytometry (FC) is widely used to identify minimal residual disease (MRD) in chronic lymphocytic leukemia (CLL), providing an essential laboratory indicator for prognostic assessment and therapeutic management. However, the high level of expertise required, the long manual analysis time, and the complexity of FC data files restrict this testing to selected reference laboratories.

Methods: We developed an artificial intelligence (AI) pipeline that automatically enhances unaltered FC files, minimizing the expertise and manual analysis time needed to detect CLL MRD on a single-tube 10-color panel (CD5, CD19, CD20, CD22, CD38, CD43, CD45, CD200, kappa and lambda). Raw FC files from peripheral blood or bone marrow specimens, comprising 166 CLL MRD-positive cases (MRD < 5%, median = 0.17%, 122 patients) and 61 MRD-negative cases (45 patients), were processed through our CCADDAS (Clustering and Classification of All events, Dimensionality reduction, Downsampling and Aberrancy Scaling) pipeline in a cloud environment (Google Vertex AI). Automated processing steps included elimination of acquisition errors (flowCut), state-of-the-art clustering (PARC), dimensionality reduction (UMAP), cluster-based anomaly detection relative to negative controls (15 bone marrow aspirates and 14 peripheral blood samples), and cluster-informed downsampling with preservation of low-event subsets. In addition, a deep neural network (DNN) trained on expert-defined subpopulations from negative controls was included for automatic gating of normal subsets (TensorFlow v2). AI-enhanced data files were analyzed using general-purpose flow cytometry analysis software (Kaluza v 3.5, Beckman Coulter), and %MRD estimates were compared to the reported results based on expert analysis of the original FC files.
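
The cluster-informed downsampling step can be illustrated with a minimal sketch. This is not the CCADDAS implementation: the function name, the `max_per_cluster` cap, and the rule of keeping every event in a cluster at or below that cap are assumptions chosen only to show how small clusters can be preserved while large ones are thinned, and how a per-event upsampling factor could be recorded alongside.

```python
import numpy as np

def cluster_informed_downsample(labels, max_per_cluster=5000, seed=0):
    """Hypothetical helper: thin large clusters, keep small ones whole.

    labels: 1-D array of cluster assignments, one per event.
    Returns (kept_indices, upsampling_factors), where each kept event
    carries the number of original events it stands for.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    kept, factors = [], []
    for cluster in np.unique(labels):
        idx = np.flatnonzero(labels == cluster)
        if idx.size <= max_per_cluster:
            chosen = idx  # low-event subset: preserved in full
        else:
            chosen = rng.choice(idx, size=max_per_cluster, replace=False)
        kept.append(chosen)
        # Each kept event represents (original size / kept size) events.
        factors.append(np.full(chosen.size, idx.size / chosen.size))
    kept = np.concatenate(kept)
    factors = np.concatenate(factors)
    order = np.argsort(kept)  # restore acquisition order
    return kept[order], factors[order]
```

With, for example, one cluster of 100,000 events and one of 50 events, this sketch keeps 5,000 events from the large cluster (factor 20) and all 50 events from the small one (factor 1), so the factors still sum to the original event count.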

Results: Cluster-informed downsampling reduced the number of cells to be manually analyzed per case from 1.1 million to 165,010 on average (85% cellularity reduction), resulting in a smaller FC data file (from 62.1 MB to 15.2 MB; 75% data reduction). Importantly, low-level MRD events were adequately preserved after downsampling (median 100% retention for both MRD < 0.01% and MRD 0.01-1%). The true number of events in gated subsets was accurately estimated using an “upsampling factor” parameter [observed # events x mean (upsampling factor)]. Gating of normal subsets was completely automated using a DNN classifier parameter. Rapid identification of suspected MRD was aided by an AI-generated “aberrancy scale” parameter that discriminated CLL MRD (mean ± 2SD: 969 ± 144) from background B-lymphoid cells (mean ± 2SD: 395 ± 401) (p < 0.0001) (AUC = 0.98), outperforming CD5 (AUC = 0.93), CD20 (AUC = 0.83), CD43 (AUC = 0.8) and CD200 (AUC = 0.7). Immunoglobulin light chain-positive CLL MRD was visually identified on a pre-calculated UMAP plot as a single-colored cluster after dual coloring for kappa and lambda. The use of AI-enhanced files reduced mean manual analysis time per case from 11.6 minutes (SD = 4.5) to 0.8 minutes (SD = 0.4) (p < 0.01) (93% reduction). CLL MRD was detected in all positive cases at or above the clinically relevant threshold (≥0.01%, as recommended by NCCN guidelines) and in 17 of 21 (81%) cases below this threshold, with excellent quantitative correlation with conventional analysis of the original FC files (linear regression slope = 1.02, intercept = 0.001%, R squared = 0.98, P < 0.0001).
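
The bracketed estimate above, observed # events x mean (upsampling factor), can be written out as a short sketch. The function names are hypothetical, and the choice of denominator for %MRD (here, any reference gate supplied by the analyst) is an assumption; the abstract does not state which denominator was used.

```python
import numpy as np

def estimate_true_events(upsampling_factors):
    """Observed event count in a gate times the mean upsampling factor.

    upsampling_factors: per-event factors for the events inside the gate
    of the downsampled file (hypothetical input layout).
    """
    f = np.asarray(upsampling_factors, dtype=float)
    if f.size == 0:
        return 0.0
    return f.size * f.mean()

def percent_mrd(mrd_gate_factors, denominator_gate_factors):
    """%MRD from estimated true counts; the denominator gate is an assumption."""
    denominator = estimate_true_events(denominator_gate_factors)
    if denominator == 0:
        raise ValueError("denominator gate contains no events")
    return 100.0 * estimate_true_events(mrd_gate_factors) / denominator

# Illustrative numbers only, not data from the study.
mrd_factors = [1.0] * 40 + [2.0] * 10   # 50 observed events, mean factor 1.2
denominator_factors = [100.0] * 1000    # 1,000 observed events, factor 100
print(estimate_true_events(mrd_factors))
print(percent_mrd(mrd_factors, denominator_factors))
```

In this illustration the 50 observed events in the MRD gate correspond to about 60 original events, and the 1,000 observed denominator events to about 100,000, giving roughly 0.06% MRD.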

Conclusion: We introduce a largely unsupervised AI pipeline that transforms raw CLL MRD FC data into a markedly smaller, AI-annotated and software-agnostic FC file that includes a comparison to normal controls. The CCADDAS pipeline simplifies and accelerates detection of CLL MRD in clinical diagnostics, reducing the number of cells analyzed by 85% and manual analysis time by 93% without impairing test performance. Moreover, CCADDAS can be used with any laboratory-developed CLL MRD assay, and its compact export is compatible with any clinical FC software, computing platform and analysis strategy. Adoption of CCADDAS is likely to facilitate the implementation of CLL MRD FC analysis by more clinical laboratories.

1858 Artificial Intelligence-Enhancement of Flow Cytometry Data Accelerates the Identification of Minimal Residual Chronic Lymphocytic Leukemia