2023 HSR&D/QUERI National Conference

1194 — Race and COVID Severity among Veterans: Applications of Machine Learning

Lead/Presenter: Orna Intrator, Canandaigua VAMC
All Authors: Cai, S. (University of Rochester & Canandaigua VAMC); Li, W. (University of Rochester); Intrator, O. (University of Rochester & Canandaigua VAMC); Makineni, M. (University of Rochester & Canandaigua VAMC); Veazie, P. (University of Rochester & Canandaigua VAMC); Luo, J. (University of Rochester & Canandaigua VAMC)

Older adults are at high risk of severe COVID-related outcomes, including intensive care unit (ICU) use, hospitalization, and death. Although studies have identified risk factors for COVID severity, it is unclear whether the effects of these factors vary by race. This study examined severe outcomes among older Veterans diagnosed with COVID by 1) using machine learning (ML) to predict severe COVID outcomes; 2) comparing the most important predictors identified by ML between Black and White Veterans; and 3) comparing ML methods with a traditional regression approach.

Data included 2020 Veterans Health Administration COVID shared data, 2018-2020 Medicare claims, the Minimum Data Set, and VA administrative data for Veterans receiving medical care in VA or community settings. Outcome variables were death within 30 days of COVID diagnosis, hospitalization within 14 days of COVID diagnosis, and ICU use during the hospitalization. We constructed a comprehensive list of individual risk factors (e.g., sociodemographic characteristics, diagnoses, and prior health care utilization). We conducted frequent pattern mining within each racial subgroup to select frequently co-occurring patterns of features for each outcome. We then developed four types of models to predict the three outcomes: standard logistic regression (LR), LR with sequential forward selection, decision tree (DT), and random forest (RF). We assessed model performance for each racial subgroup using the area under the curve (AUC).
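The subgroup evaluation described above can be illustrated with a minimal sketch. It assumes the AUC is the area under the ROC curve, computed here in its rank-based form (the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half). The function names, group labels, and toy records are hypothetical and are not taken from the study data or code.

```python
from collections import defaultdict

def auc(labels, scores):
    """Rank-based ROC AUC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_by_group(groups, labels, scores):
    """Compute the AUC separately within each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}

# Made-up illustration only: two subgroups, binary outcome, predicted risk.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels = [1, 0, 1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.3, 0.5, 0.5]
result = auc_by_group(groups, labels, scores)
print(result)
```

The pairwise loop is quadratic in subgroup size, so it is suited to illustration only; a cohort of this size would call for a sort-based computation or a library routine.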

We identified 183,781 Veterans (23,332 Black and 160,449 White) aged 65 years or older. Mortality was 17.3% among Black Veterans and 17.3% among White Veterans; hospitalization rates were 42.8% and 33.4%, and the prevalence of ICU use was 11.2% and 6.4%, respectively. The DT model had the worst performance. The standard LR achieved results similar to those of the RF model, especially for White Veterans. The performance of the standard LR could be improved slightly by adding the sequential forward selection procedure. The frequent pattern mining procedure identified combinations of multiple risk factors and improved the AUC for White Veterans. For example, adding frequent patterns improved the AUC for death obtained from LR with sequential forward selection from 0.706 to 0.747. Although age was the most important risk factor for death and hospitalization among both Black and White Veterans, the importance of the other identified risk factors varied by race and outcome. For instance, chronic obstructive pulmonary disease (COPD) was more important in predicting mortality for Black Veterans than for White Veterans, whereas smoking status was more important in predicting mortality for White Veterans than for Black Veterans.
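As a rough illustration of how combinations of risk factors can be turned into additional model features, the sketch below counts co-occurring feature sets by brute-force enumeration and keeps those that meet a minimum support. This is a simplification and not necessarily the mining algorithm used in the study (Apriori or FP-growth would be typical choices at scale); the feature names, the support threshold, and the helper functions are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(records, min_support, max_size=3):
    """Return {pattern: support} for feature combinations of size 2..max_size
    that appear in at least a min_support fraction of the records."""
    counts = Counter()
    for features in records:
        items = sorted(features)  # sort so each pattern has one canonical order
        for k in range(2, max_size + 1):
            counts.update(combinations(items, k))
    n = len(records)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}

def pattern_indicator(features, pattern):
    """1 if every risk factor in the pattern is present for this record, else 0."""
    return int(set(pattern) <= set(features))

# Made-up records: each is the set of risk factors present for one person.
records = [
    {"copd", "smoker", "diabetes"},
    {"copd", "smoker"},
    {"diabetes"},
    {"copd", "smoker", "heart_failure"},
]
patterns = frequent_patterns(records, min_support=0.5)

# Each retained pattern becomes one extra binary column for the classifier.
extra_column = [pattern_indicator(r, ("copd", "smoker")) for r in records]
print(patterns, extra_column)
```

Each indicator column acts as an interaction term among the risk factors in its pattern, which is one way a main-effects logistic regression could pick up joint effects.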

Predictive performance varied by ML model and by race. The important risk factors identified varied by race and outcome. Adding interaction effects among risk factors improved model performance. Future work is needed to understand what drives the racial differences in the relationships between risk factors and outcomes. The study findings predate the wide adoption of COVID vaccination, so the analysis should be repeated with data from after vaccination became widespread.

When identifying risk factors for a clinical outcome, it may be important to examine them separately for each racial group, as the clinical implications of these factors may differ by race.