High Prevalence of Subjective Minimizing Language in Clinical Trials of Hematologic Malignancies: Natural Language Processing (NLP) Validation Study and Systematic Review of Randomized Controlled Trials Presented at ASH 2009-2021

Chin-Yee, Benjamin

Oral and Poster Abstracts
902. Health Services and Quality Improvement - Lymphoid Malignancies: Poster II

Research, clinical trials, artificial intelligence (AI), adult, Clinical Practice (Health Services and Quality), Clinical Research, health outcomes research, pediatric, patient-reported outcomes, real-world evidence, Adverse Events, young adult, Technology and Procedures, Study Population, Human, machine learning, natural language processing

Benjamin Chin-Yee, MD, MA¹,²*, Tiancheng Hu, MSc³*, Clarissa Skorupski, MD⁴*, Sarah Ghnaim, MD⁵*, Bishal Gyawali, PhD, MD⁶*, Turab Mohammed, MD⁷*, James Yu, MD⁸*, Gary H. Lyman, MD⁹,¹⁰, Michelle Sholzberg, MD¹¹,¹², Lisa Hicks, MD¹¹,¹² and Nicole M. Kuderer, MD, MSc#¹³

¹Division of Hematology, Department of Medicine, Western University, London, ON, Canada
²Department of History and Philosophy of Science, University of Cambridge, Cambridge, United Kingdom
³Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, United Kingdom
⁴Department of Medicine, University of Toronto, Toronto, ON, Canada
⁵Department of Medicine, Western University, London, Canada
⁶Department of Oncology, Queen's University, Kingston, Canada
⁷Moffitt Cancer Center, Tampa, FL
⁸Division of Hematology and Medical Oncology, H. Lee Moffitt Cancer Center, Tampa, FL
⁹Fred Hutchinson Cancer Center, Seattle, WA
¹⁰Department of Medicine, University of Washington School of Medicine, Seattle, WA
¹¹Li Ka Shing Knowledge Institute, University of Toronto, Toronto, ON, Canada
¹²Division of Hematology-Oncology, St. Michael's Hospital, Toronto, ON, Canada
¹³Advanced Cancer Research Group, Seattle, WA

Background: Transparent and objective reporting of treatment toxicities in cancer clinical trials is critical to inform patient-centred, shared decision-making. Previous studies have shown that toxicity reporting is inconsistent and incomplete in randomized controlled trials (RCTs) presented at major conferences in both hematologic (Chin-Yee et al. 2022; Skorupski et al. 2022) and gastrointestinal malignancies (Yu et al. 2023). The objectives of this study were to validate an NLP-based algorithm to identify subjectively minimized toxicity language in conference abstracts, and to evaluate longitudinal changes in the prevalence of minimized language in RCTs presented at ASH.

Methods: For NLP models, data from prior systematic reviews of RCTs presented at ASH 2017-2021 were used as development (RCTs in acute leukemia: Chin-Yee et al. 2022) and validation (multiple myeloma and lymphoma: Skorupski et al. 2022) datasets. Because subjective minimizing language usually exhibits limited variability, we adopted a dictionary-based approach that is highly interpretable. Two dictionaries were developed: the first to identify subjective minimizing toxicity language; the second to identify reporting of patient experiences through Patient-Reported Outcomes (PROs) or Quality-of-Life (QOL) measures. Primary minimizing terms were defined as “tolerable”, “manageable”, “acceptable”, and “favorable”; secondary minimizing terms were “feasible”, “safe”, and “limited” (Chin-Yee et al. 2022). The primary outcome was F1 score (the harmonic mean of precision and recall) for identification of primary minimizing language. Based on F1 score in the development set, we operationalized our dictionary to include 3 primary minimizing terms, “tolerable”, “manageable”, and “acceptable” (including relevant variants), while dropping “favorable” and all secondary minimizing terms. Precision, recall, F1 score, and accuracy were calculated for both dictionaries in each dataset (see Table 1 for definitions). Validated dictionaries for minimizing terms and for PRO/QOL measures were subsequently applied in a systematic review of RCT abstracts at ASH from 2009-2021 (representing the available time period indexed in Embase) across 3 diseases (acute leukemia, myeloma, and lymphoma) to assess changes in use of subjective minimizing language and reporting of PROs or QOL measures over 3 a priori defined time periods: earliest available, middle, and most recent. Study inclusion/exclusion criteria have been described previously (Chin-Yee et al. 2022).
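The dictionary-based approach and its evaluation metrics can be illustrated with a minimal Python sketch. This is not the study's implementation: the regular expression, the function names `flags_minimizing` and `evaluate`, and the choice of word-stem variants are assumptions made for illustration, since the abstract names the three retained terms but does not list the exact variants used. The metrics follow the standard definitions of precision, recall, F1 score, and accuracy.

```python
import re
from dataclasses import dataclass

# Hypothetical variant patterns for the three retained primary terms.
# The leading \b prevents matches inside negated forms such as
# "unacceptable" or "intolerable".
MINIMIZING_PATTERN = re.compile(
    r"\b(?:tolerab\w*|manageab\w*|acceptab\w*)",
    re.IGNORECASE,
)


def flags_minimizing(abstract: str) -> bool:
    """Return True if the text contains any dictionary term."""
    return MINIMIZING_PATTERN.search(abstract) is not None


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def evaluate(predicted: list[bool], reference: list[bool]) -> Metrics:
    """Compare dictionary flags against manual (reference) labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)          # flagged, truly minimizing
    fp = sum(p and not r for p, r in pairs)      # flagged, not minimizing
    fn = sum(not p and r for p, r in pairs)      # missed minimizing language
    tn = sum(not p and not r for p, r in pairs)  # correctly not flagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f1, accuracy)


if __name__ == "__main__":
    # Invented toy sentences, not drawn from any real abstract.
    texts = [
        "The regimen had a manageable safety profile.",
        "Grade 3-4 neutropenia occurred in 42% of patients.",
        "Toxicity was acceptable in both arms.",
        "Unacceptable toxicity led to early termination.",
    ]
    manual_labels = [True, False, True, False]
    flags = [flags_minimizing(t) for t in texts]
    print(evaluate(flags, manual_labels))
```

A pattern-based dictionary like this keeps every positive call traceable to a specific matched term, which is the interpretability advantage noted above; the trade-off is that context-dependent uses (for example, "acceptable" describing something other than toxicity) would need manual review or additional rules.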

Results: Study characteristics are reported in Table 1A for NLP development and validation sets. Our dictionary-based method showed a precision of 0.90, recall of 0.82, F1 of 0.86, and accuracy of 0.93 in the development set, values considered sufficient for validation of the NLP model. In the validation set, these values were 0.75, 0.75, 0.75, and 0.82, respectively. This NLP model was applied to evaluate RCTs presented at ASH from 2009-2021 across the 3 diseases (acute leukemia, myeloma, and lymphoma). Following abstract screening, inclusion criteria were met in 68 of 411, 82 of 443, and 82 of 581 acute leukemia, myeloma, and lymphoma RCTs, respectively. NLP-assessed subjective minimization was present in 89 (26.4%) of all studies from 2009-2021: 26 (22.0%) RCTs from 2009-2012, 29 (25.4%) RCTs from 2013-2016, and 34 (32.3%) RCTs from 2017-2021 (Table 1B). Time-series analysis and results on PROs and QOL measures will also be presented at ASH.
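As a consistency check, the reported F1 scores can be recomputed from the reported precision (P) and recall (R), assuming the standard definition of F1 as their harmonic mean:

```latex
\[
F_1 = \frac{2PR}{P + R}
\]
\[
\text{Development set: } F_1 = \frac{2 \times 0.90 \times 0.82}{0.90 + 0.82} = \frac{1.476}{1.72} \approx 0.86
\]
\[
\text{Validation set: } F_1 = \frac{2 \times 0.75 \times 0.75}{0.75 + 0.75} = \frac{1.125}{1.50} = 0.75
\]
```

Both values agree with the F1 scores reported for the two datasets.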

Discussion: NLP provides a novel, systematic, and scalable approach for evaluating use of subjective minimizing language in clinical trials. Our model showed good accuracy in identifying primary minimizing terms in RCTs presented at ASH from 2017-2021. Applying this model across a broader time period (2009-2021) suggests that use of subjectively minimized toxicity language is common in RCTs at ASH, similar across lymphoid and myeloid malignancies, and potentially increasing over time. NLP could provide an effective means of evaluating conference abstracts on a large scale, with the ultimate goal of improving the objectivity of toxicity reporting in clinical trials. In conclusion, our findings suggest that the use of minimizing language is common and may be increasing over time, a concerning pattern that implies downplaying of treatment toxicity by investigators.

^co-first and #co-last authors

Disclosures: Lyman: Amgen: Research Funding; Beyond Spring: Consultancy; G1 Therapeutics: Consultancy; Partner Therapeutics: Consultancy; Samsung Bioepis: Consultancy; Merck: Consultancy; Jazz: Consultancy; TEVA: Consultancy; Squibb: Consultancy; Sandoz: Consultancy; Seattle Genetics: Consultancy; Fresenius Kabi: Consultancy. Sholzberg: CSL Behring: Research Funding; Pfizer: Honoraria, Research Funding; Octapharma: Honoraria, Research Funding. Kuderer: Astra Zeneca: Consultancy; Janssen: Consultancy; Pfizer: Consultancy; BMS: Consultancy; Beyond Springs: Consultancy; G1 Therapeutics: Consultancy; Sandoz: Consultancy; Seattle Genetics: Consultancy; Fresenius: Consultancy.

*signifies non-member of ASH

3709 High Prevalence of Subjective Minimizing Language in Clinical Trials of Hematologic Malignancies: Natural Language Processing (NLP) Validation Study and Systematic Review of Randomized Controlled Trials Presented at ASH 2009-2021