-Asterisk * with author name denotes a Non-ASH member

3709 High Prevalence of Subjective Minimizing Language in Clinical Trials of Hematologic Malignancies: Natural Language Processing (NLP) Validation Study and Systematic Review of Randomized Controlled Trials Presented at ASH 2009-2021

Program: Oral and Poster Abstracts
Session: 902. Health Services and Quality Improvement - Lymphoid Malignancies: Poster II
Hematology Disease Topics & Pathways:
Research, clinical trials, artificial intelligence (AI), adult, Clinical Practice (Health Services and Quality), Clinical Research, health outcomes research, pediatric, patient-reported outcomes, real-world evidence, Adverse Events, young adult, Technology and Procedures, Study Population, Human, machine learning, natural language processing
Sunday, December 10, 2023, 6:00 PM-8:00 PM

Benjamin Chin-Yee, MD, MA1,2*, Tiancheng Hu, MSc^3*, Clarissa Skorupski, MD4*, Sarah Ghnaim, MD5*, Bishal Gyawali, PhD, MD6*, Turab Mohammed, MD7*, James Yu, MD8*, Gary H. Lyman, MD9,10, Michelle Sholzberg, MD11,12, Lisa Hicks, MD11,12 and Nicole M. Kuderer, MD, MSc#13

1Division of Hematology, Department of Medicine, Western University, London, ON, Canada
2Department of History and Philosophy of Science, University of Cambridge, Cambridge, United Kingdom
3Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, United Kingdom
4Department of Medicine, University of Toronto, Toronto, ON, Canada
5Department of Medicine, Western University, London, Canada
6Department of Oncology, Queen's University, Kingston, Canada
7Moffitt Cancer Center, Tampa, FL
8Division of Hematology and Medical Oncology, H. Lee Moffitt Cancer Center, Tampa, FL
9Fred Hutchinson Cancer Center, Seattle, WA
10Department of Medicine, University of Washington School of Medicine, Seattle, WA
11Li Ka Shing Knowledge Institute, University of Toronto, Toronto, ON, Canada
12Division of Hematology-Oncology, St. Michael's Hospital, Toronto, ON, Canada
13Advanced Cancer Research Group, Seattle, WA

Background: Transparent and objective reporting of treatment toxicities in cancer clinical trials is critical to inform patient-centred, shared decision-making. Previous studies have shown that toxicity reporting is inconsistent and incomplete in randomized controlled trials (RCTs) presented at major conferences in both hematologic (Chin-Yee et al. 2022; Skorupski et al. 2022) and gastrointestinal malignancies (Yu et al. 2023). The objectives of this study were to validate an NLP-based algorithm to identify subjectively minimized toxicity language in conference abstracts, and to evaluate longitudinal changes in the prevalence of minimized language in RCTs presented at ASH.

Methods: For NLP models, data from prior systematic reviews of RCTs presented at ASH 2017-2021 were used as development (RCTs in acute leukemia: Chin-Yee et al. 2022) and validation (multiple myeloma and lymphoma: Skorupski et al. 2022) datasets. Because subjective minimizing language usually exhibits limited variability, we adopted a dictionary-based approach that is highly interpretable. Two dictionaries were developed: the first to identify subjective minimizing toxicity language; the second to identify reporting of patient experiences through Patient-Reported Outcomes (PROs) or Quality-of-Life (QOL) measures. Primary minimizing terms were defined as: “tolerable”, “manageable”, “acceptable”, and “favorable”; secondary minimizing terms were: “feasible”, “safe”, and “limited” (Chin-Yee et al. 2022). The primary outcome was F1 score (the harmonic mean of precision and recall) for identification of primary minimizing language. Based on F1 score in the development set, we operationalized our dictionary to include 3 primary minimizing terms, “tolerable”, “manageable”, and “acceptable” (including relevant variants), while dropping “favorable” and all secondary minimizing terms. Precision, recall, F1 score, and accuracy were calculated for both dictionaries in each dataset (see Table 1 for definitions). Validated dictionaries for minimizing terms and for PRO/QOL measures were subsequently applied in a systematic review of RCT abstracts at ASH from 2009-2021 (representing the available time period indexed in Embase) across 3 diseases (acute leukemia, myeloma, and lymphoma) to assess changes in use of subjective minimizing language and reporting of PROs or QOL measures over 3 major time periods defined a priori: earliest available, middle, and most recent. Study inclusion/exclusion criteria were described previously (Chin-Yee et al. 2022).
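
The dictionary-based approach and the evaluation metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the regular expression used to capture "relevant variants", the function names `flags_minimizing` and `evaluate`, and the example sentences and labels are all hypothetical.

```python
import re

# Hypothetical pattern for the three operationalized primary terms and
# simple variants (e.g. "tolerability", "manageably"). The authors' actual
# dictionary entries are not given in the abstract.
MINIMIZING_PATTERN = re.compile(
    r"\b(tolerab\w*|manageab\w*|acceptab\w*)\b",
    re.IGNORECASE,
)


def flags_minimizing(text: str) -> bool:
    """Return True if the text contains any dictionary term."""
    return MINIMIZING_PATTERN.search(text) is not None


def evaluate(predicted: list[bool], actual: list[bool]) -> dict[str, float]:
    """Compute precision, recall, F1, and accuracy against reference labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(actual) if actual else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up example sentences with made-up reference (manual) labels.
abstracts = [
    "The regimen had a manageable safety profile.",
    "Grade 3-4 neutropenia occurred in 40% of patients.",
    "Toxicity was acceptable and tolerable.",
    "Side effects were mild and easily handled.",
]
manual_labels = [True, False, True, True]

predictions = [flags_minimizing(a) for a in abstracts]
metrics = evaluate(predictions, manual_labels)
```

In this toy example the last sentence is labeled as minimizing by the hypothetical reviewer but contains no dictionary term, so it counts as a false negative. That is the trade-off of a fixed dictionary: it is easy to interpret, but it misses phrasing outside its term list.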

Results: Study characteristics are reported in Table 1A for NLP development and validation sets. Our dictionary-based method showed a precision of 0.90, recall of 0.82, F1 of 0.86, and accuracy of 0.93 in the development set, values considered sufficient for validation of the NLP model. In the validation set, these values were 0.75, 0.75, 0.75, and 0.82, respectively. This NLP model was applied to evaluate RCTs presented at ASH from 2009-2021 across the 3 diseases (acute leukemia, myeloma, and lymphoma). Following abstract screening, 68 of 411 acute leukemia, 82 of 443 myeloma, and 82 of 581 lymphoma RCTs met inclusion criteria. NLP-assessed subjective minimization was present in 89 (26.4%) of all studies from 2009-2021: 26 (22.0%) RCTs from 2009-2012, 29 (25.4%) RCTs from 2013-2016, and 34 (32.3%) RCTs from 2017-2021 (Table 1B). Time-series analysis and results on PROs and QOL measures will also be presented at ASH.
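
As a consistency check, the reported F1 scores can be recovered from the reported precision (P) and recall (R), assuming the standard definition of F1 as their harmonic mean (Table 1, which gives the authors' definitions, is not reproduced here). Because the inputs are already rounded to two decimals, the match is approximate.

```latex
% Development set: P = 0.90, R = 0.82
F_1 = \frac{2PR}{P + R}
    = \frac{2(0.90)(0.82)}{0.90 + 0.82}
    = \frac{1.476}{1.72}
    \approx 0.86

% Validation set: P = 0.75, R = 0.75
F_1 = \frac{2(0.75)(0.75)}{0.75 + 0.75}
    = \frac{1.125}{1.50}
    = 0.75
```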

Discussion: NLP provides a novel, systematic, and scalable approach for evaluating use of subjective minimizing language in clinical trials. Our model showed good accuracy in identifying primary minimizing terms in RCTs presented at ASH from 2017-2021. Applying this model across a broader time period (2009-2021) suggests that use of subjectively minimized toxicity language is common in RCTs at ASH, similar across lymphoid and myeloid malignancies, and potentially increasing over time. NLP could provide an effective means of evaluating conference abstracts on a large scale with the ultimate goal of improving the objectivity of toxicity reporting in clinical trials. In conclusion, our findings suggest that the use of minimizing language is common and may be increasing over time, a concerning finding that implies downplaying of treatment toxicity by investigators.

^co-first and #co-last authors

Disclosures: Lyman: Amgen: Research Funding; Beyond Spring: Consultancy; G1 Therapeutics: Consultancy; Partner Therapeutics: Consultancy; Samsung Bioepis: Consultancy; Merck: Consultancy; Jazz: Consultancy; TEVA: Consultancy; Squibb: Consultancy; Sandoz: Consultancy; Seattle Genetics: Consultancy; Fresenius Kabi: Consultancy. Sholzberg: CSL Behring: Research Funding; Pfizer: Honoraria, Research Funding; Octapharma: Honoraria, Research Funding. Kuderer: Astra Zeneca: Consultancy; Janssen: Consultancy; Pfizer: Consultancy; BMS: Consultancy; Beyond Springs: Consultancy; G1 Therapeutics: Consultancy; Sandoz: Consultancy; Seattle Genetics: Consultancy; Fresenius: Consultancy.
