Session: 803. Emerging Diagnostic Tools and Techniques: Poster I
Hematology Disease Topics & Pathways:
Leukemia, Diseases, Lymphoma (any), Biological Processes, DNA damage, DNA repair, Technology and Procedures, Lymphoid Malignancies, Myeloid Malignancies, genomics, molecular testing, NGS, RNA sequencing, pathogenesis, WGS
Methods: Random forest models were generated with the randomForest package in R and then tuned with the R caret package. Training data sets consisted of fusion calls deemed true by review and by orthogonal methods, including PCR/Sanger sequencing and the commercial Archer™ fusion calling system. We present the results of training on calls made by five fusion callers: Arriba, STAR-Fusion, FusionCatcher, deFuse, and Kallisto/pizzly. A logical (binary) training variable, seen vs. not seen by the fusion caller, was used for each of the five callers. Variables also included metrics for the magnitude and balance of coverage on either side of candidate fusion breakpoints reported by Arriba and STAR-Fusion (“coverage balance”) and a single metric consisting of the number of sequencing reads that cross the candidate breakpoint. The model was validated by 10-fold cross-validation on 598 fusion calls made by the five callers.
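As an illustration of the kind of feature table described above, the following is a minimal Python sketch (the work itself was done in R with randomForest and caret). It encodes one candidate fusion as five seen/not-seen flags plus coverage and read-count metrics, and deals 598 calls into 10 cross-validation folds. The `Candidate` fields, the `coverage_balance` formula (smaller side divided by larger side), and the example values are assumptions made for illustration; the abstract does not define how coverage balance was computed or how folds were assigned.

```python
from dataclasses import dataclass

# Caller names taken from the abstract; the order here is arbitrary.
CALLERS = ("Arriba", "STAR-Fusion", "FusionCatcher", "deFuse", "Kallisto/pizzly")


@dataclass
class Candidate:
    """One candidate fusion. All field names are hypothetical."""
    gene_pair: str
    seen_by: frozenset      # names of the callers that reported this fusion
    cov_left: int           # assumed: coverage on one side of the breakpoint
    cov_right: int          # assumed: coverage on the other side
    spanning_reads: int     # reads crossing the candidate breakpoint


def coverage_balance(left, right):
    """Assumed balance score in [0, 1]: 1.0 when both sides are equally
    covered, 0.0 when either side has no coverage."""
    larger = max(left, right)
    return min(left, right) / larger if larger else 0.0


def feature_row(c):
    """Five binary caller flags, then coverage magnitude, balance, read count."""
    flags = [1 if name in c.seen_by else 0 for name in CALLERS]
    magnitude = c.cov_left + c.cov_right
    return flags + [magnitude, coverage_balance(c.cov_left, c.cov_right), c.spanning_reads]


def kfold_indices(n, k=10):
    """Deal indices 0..n-1 round-robin into k folds; yield (train, test) lists."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Made-up example values, not data from the study.
example = Candidate("BCR--ABL1", frozenset({"Arriba", "STAR-Fusion"}), 40, 10, 7)
row = feature_row(example)
splits = list(kfold_indices(598, k=10))
```

Rows built this way, paired with a true/false label for each call, would form the table passed to a random forest implementation. A real analysis would normally shuffle or stratify the calls before assigning folds.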
Results: The resulting model outperforms the simple strategy of requiring agreement by n of five callers, particularly in specificity (Table 1). In addition, the “importance of variables” reported by randomForest gauges the relative contribution of each variable to the model. Here it shows that one caller, Kallisto/pizzly, does not contribute to the model (Table 2).
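The n-of-five baseline referred to above can be sketched in a few lines of Python. The sketch accepts a call when at least n callers report it, then scores the result with the standard definitions of sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP). The function names and the six toy calls are invented for illustration and are not the values in Table 1.

```python
def n_of_k_call(flags, n):
    """Accept a candidate fusion when at least n callers reported it."""
    return sum(flags) >= n


def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for paired boolean lists."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data: (seen/not-seen flags for five callers, is the fusion real?)
toy_calls = [
    ([1, 1, 1, 0, 0], True),
    ([1, 0, 0, 0, 0], False),
    ([1, 1, 0, 0, 0], False),
    ([0, 1, 0, 0, 0], True),
    ([1, 1, 1, 1, 0], True),
    ([0, 0, 0, 0, 0], False),
]
truth = [label for _, label in toy_calls]

# Score the baseline at every threshold n = 1..5.
results = {
    n: sensitivity_specificity([n_of_k_call(f, n) for f, _ in toy_calls], truth)
    for n in range(1, 6)
}
```

On this toy set, raising n from 2 to 3 removes the one false positive without losing a true call. Larger thresholds would eventually trade sensitivity for specificity, which is the limitation a trained model is meant to ease.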
Conclusion: Random forest modeling provides a viable means of combining gene fusion calls from multiple callers into a single fusion calling tool that performs better than simple combinations of fusion calls. An additional benefit is that building and evaluating such models can guide the selection of fusion callers, eliminating non-contributory calling methods and making efficient use of computational resources.
Disclosures: Thomas: NeoGenomics, Inc.: Current Employment. Mou: NeoGenomics: Current Employment. Keeler: NeoGenomics: Current Employment. Magnan: NeoGenomics: Current Employment. Funari: NeoGenomics: Current Employment. Weiss: Merck: Other: Speaker; Bayer: Other: Speaker; Genentech: Other: Speaker; NeoGenomics: Current Employment. Brown: NeoGenomics, Inc.: Current Employment. Agersborg: NeoGenomics: Current Employment.