Validation of Dynamic Deep Learning for the Prediction of Cancer-Associated Thrombosis

Mantha, Simon

Background: Currently available models for the prediction of cancer-associated thrombosis (CAT) have limited performance and are intended for single-time use, typically at the start of chemotherapy. Furthermore, prolonged exposure to therapeutic anticoagulation may unnecessarily increase the risk of bleeding. Dynamic modelling offers the possibility of making multiple predictions over time as thrombosis risk changes, while a deep learning approach has the potential to take advantage of complex relationships between predictors and outcome.

Methods: The primary derivation cohort consisted of adult patients with a solid tumor malignancy from the Memorial Sloan Kettering (MSK) IMPACT cohort enrolled between 2014 and 2019. The CAT outcome was defined as the occurrence of lower extremity deep vein thrombosis or pulmonary embolism. Prior and incident CAT events through the end of 2020 were detected mostly with the CEDARS+PINES natural language processing platform (Mantha et al, 2024). Training, validation and test sets were derived randomly in a 60:20:20 ratio, matching for event type. Using the Dynamic-DeepHit framework (Lee et al, 2019) as adapted to PyTorch (Jeanselme, 2024) and based on DeepSurvivalMachines (Nagpal et al, 2020), a neural network model was trained to predict incident CAT dynamically over time. Predictors included age, sex, cancer type, time from cancer diagnosis, follow-up time at the cancer center, time from last parenteral chemotherapy for eight pharmacological classes, presence of metastatic disease, total white blood cell count, hemoglobin, mean corpuscular volume, platelet count and all components of the complete metabolic profile. Hyperparameter tuning was done on the training set, using the validation set for early stopping and to select the best model. Final performance after rounds of optimization was assessed on the MSK test set. The main metric was Harrell’s C-index, assessed at clinically pertinent time horizons for each of the four contiguous 180-day periods during the first two years from cohort entry. The final model was validated on an external cohort from Harris Health System, affiliated with Baylor College of Medicine (BCM), from 2011 to 2020, where CAT events were detected using a computable phenotype algorithm (Li et al, 2023).
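
As an illustration of the 60:20:20 split matched for event type, the following is a minimal sketch and not the authors' code. It assumes each patient carries a single event-type label and splits within each label group. The function name `stratified_split` and the label coding are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(event_types, ratios=(0.6, 0.2, 0.2), seed=0):
    """Return (train, val, test) index lists, split within each event-type stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, label in enumerate(event_types):
        strata[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(strata):
        members = strata[label]
        rng.shuffle(members)  # random assignment within the stratum
        n_train = round(ratios[0] * len(members))
        n_val = round(ratios[1] * len(members))
        train += members[:n_train]
        val += members[n_train:n_train + n_val]
        test += members[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical label coding: 0 = no CAT, 1 = deep vein thrombosis, 2 = pulmonary embolism
labels = [0] * 80 + [1] * 10 + [2] * 10
train, val, test = stratified_split(labels)
```

Splitting within each stratum keeps the proportion of each event type roughly equal across the three sets, which matters when events are rare.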

Results: After exclusion of individuals with a prior CAT episode, ongoing anticoagulation or missing data, the MSK cohort included 23,353 patients and the BCM cohort included 7,466 patients, corresponding to 461,736 and 124,848 distinct predictor value sets, respectively. Both cohorts included adult patients with a broad range of solid tumor malignancies, although the external validation cohort included mostly patients from underserved communities. The proportion of individuals identifying as White, Black/African American, Asian and Other was 82%, 7%, 8% and 3% respectively in the MSK cohort, compared with 64%, 27%, 6% and 3% respectively in the BCM cohort (p<0.01). In the MSK cohort, 6% of patients identified as Hispanic or Latino, compared with 52% in the BCM cohort (p<0.01). In the MSK test set, the cross-period weighted mean for Harrell’s C-index was 0.78, 0.76, 0.76, 0.75, 0.72 and 0.70 for prediction time horizons of 7, 14, 21, 28, 84 and 168 days respectively. The corresponding values in the BCM cohort were 0.81, 0.80, 0.74, 0.73, 0.72 and 0.71 for the same time horizons. In this external test set, there were 7,293, 4,454, 2,734 and 1,690 evaluable patients for the observation periods of days [0-180], [181-360], [361-540] and [541-720] respectively. Performance varied but was largely preserved over the first 2 years of follow-up.
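
For readers unfamiliar with the metric, the sketch below shows one common way to compute Harrell’s C-index at a fixed time horizon and to average it across observation periods. It is illustrative only and rests on two assumptions that the abstract does not state: follow-up is censored at the horizon, and the cross-period mean is weighted by the number of evaluable patients. The per-period C-index values in the example are hypothetical; only the patient counts come from the text.

```python
def harrell_c_index(times, events, risks, horizon=None):
    """Harrell's C: share of comparable pairs in which the earlier event has the higher risk.

    If `horizon` is given, follow-up is censored at that time (an assumption of this sketch).
    """
    if horizon is not None:
        events = [bool(e) and t <= horizon for t, e in zip(times, events)]
        times = [min(t, horizon) for t in times]

    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject j was still event-free at times[i]
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable if comparable else float("nan")


def weighted_mean(values, weights):
    """Weighted average, e.g. of per-period C-index values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data: follow-up in days, event indicator, predicted risk (all hypothetical)
times = [5, 10, 20, 30]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_14 = harrell_c_index(times, events, risks, horizon=14)

# Hypothetical per-period C-index values, weighted by the evaluable patient counts above
period_c = [0.80, 0.78, 0.75, 0.74]
period_n = [7293, 4454, 2734, 1690]
overall_c = weighted_mean(period_c, period_n)
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration but too slow for cohorts of this size; a vectorized or library implementation would be needed in practice.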

Conclusions: Dynamic modelling using a deep learning approach can be applied successfully to predict CAT. In the current use case, model discrimination was highest with a 7-day time horizon, although longer time spans also yielded potentially clinically useful risk estimates. When appropriately implemented, this dynamic model may lead to adaptive and shortened exposure to therapeutic anticoagulation, given when a patient’s temporal risk profile exceeds a predetermined threshold. Generalization of the model to an external test set was satisfactory, despite significant racial and socioeconomic differences between the two groups and the labelling of CAT events with different algorithms. Additional work is required to further validate and implement this methodology.
