Session: 901. Health Services Research—Non-Malignant Conditions: Poster III
Hematology Disease Topics & Pathways:
Diseases, Genetic Disorders, Clinically relevant
Methods: This study will comprise three parts. Part 1, a retrospective observational database analysis, will use data from the electronic patient database of the Maccabi Healthcare Service (MHS), the second-largest health maintenance organization in Israel. The MHS includes 2.2 million health records from 25% of the Israeli population. Clinical records have been fully computerized for >20 years and are fully integrated with automated central laboratory, digitized imaging, and pharmacy purchase data. Patients with confirmed Gaucher disease (GD) who have been enrolled in the MHS health plan for ≥1 year will be eligible for inclusion; approximately 250 such patients are expected to be enrolled. Using MHS data from these patients, the Gaucher Earlier Diagnosis Consensus (GED-C) scoring system will be evaluated and compared with alternative scores developed directly from clinical data using supervised machine learning. The GED-C scoring system was developed by a consensus panel using Delphi methodology to identify the signs and co-variables that may be important for the diagnosis of GD.
In Part 2, a clinical study, the best performing modeled scores from Part 1 will be applied to the MHS database to identify individuals who may have undiagnosed GD (‘GD suspects’). Samples for diagnostic testing will be collected from the MHS biobank for individuals who have consented; individuals not participating in the biobank will be asked to provide a sample. Testing will use a specific and sensitive biomarker (glucosylsphingosine, lyso-Gb1), followed by beta-glucocerebrosidase (GBA) genotyping of positive samples. This part of the study will evaluate the predictive value of the modeled scores and assess the sensitivity and specificity of the model for the diagnosis of new patients with GD.
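As an illustration only, the quantities named in Part 2 (sensitivity, specificity, and positive predictive value) can be computed from a two-by-two table of model flags against confirmed diagnoses. The sketch below is a minimal example under stated assumptions: the function name `diagnostic_metrics` and all counts are hypothetical and are not study results.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity and positive predictive value (PPV).

    tp: flagged by the model and confirmed GD
    fp: flagged by the model, GD not confirmed
    fn: not flagged by the model but confirmed GD
    tn: not flagged by the model, no GD
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator needs at least one individual")
    return {
        "sensitivity": tp / (tp + fn),  # share of true GD cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-GD individuals left unflagged
        "ppv": tp / (tp + fp),          # share of flagged individuals with confirmed GD
    }


# Hypothetical counts chosen only to exercise the function.
metrics = diagnostic_metrics(tp=8, fp=12, fn=2, tn=978)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a rare disease, the positive predictive value can be modest even when specificity is high, which is one reason to report predictive value alongside sensitivity and specificity.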
In Part 3, data from newly diagnosed patients identified in Part 2 will be used to develop machine learning models for the diagnosis of GD (Figure 1). Signs and co-variables included in the GED-C score will be used, eliminating features that are non-informative. Features will be quantitative where possible, and interaction terms will be added for age of onset and trend for key features. Several methods will be developed, and the one with the highest precision at a given sensitivity level will be selected as the final model. External validation of the best model is planned to ensure an unbiased estimate of its accuracy.
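The selection criterion in Part 3, precision at a given sensitivity level, can be sketched as follows. This is a minimal illustration, not the study's implementation: the function `precision_at_sensitivity`, the model names, the labels, the scores, and the 0.6 sensitivity target are all hypothetical.

```python
def precision_at_sensitivity(labels, scores, target_sensitivity):
    """Return (precision, threshold) at the highest score threshold
    whose sensitivity reaches the target.

    labels: 1 for confirmed GD, 0 otherwise
    scores: model output, higher meaning more likely GD
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank individuals from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all individuals tied at this score so the threshold is well defined.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if tp / total_pos >= target_sensitivity:
            return tp / (tp + fp), threshold
    raise ValueError("target sensitivity must not exceed 1")


# Hypothetical labels and scores from two candidate models.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
candidate_scores = {
    "model_a": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2],
    "model_b": [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.5],
}

# Compare candidates at the same sensitivity target and keep the most precise one.
results = {
    name: precision_at_sensitivity(labels, scores, 0.6)
    for name, scores in candidate_scores.items()
}
best_model = max(results, key=lambda name: results[name][0])
print(results)
print("selected:", best_model)
```

Fixing the sensitivity first and then comparing precision puts all candidate models on the same footing with respect to missed cases, so the comparison reflects only how many false alerts each model would generate.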
Discussion: The main goal of the study is to develop an algorithm that helps detect patients with GD, independent of physicians’ ability to recognize signs and symptoms, by applying machine learning to data from a large health database. The study is expected to result in a practical tool that will alert physicians to the possibility of GD. The resulting model will also improve our understanding of GD based on the relative importance of features for GD prediction. Such tools will have a positive impact on patient care, quality of life, and healthcare costs, and may lead to a change in approach for diagnosing rare diseases.
Disclosures: Revel-Vilk: Takeda: Honoraria; sanofi-Genzyme: Honoraria; Pfizer: Honoraria. Chodick: Novartis Pharma AG: Other: Institutional grant. Gadir: Takeda: Current Employment.