1014 — Should Healthcare Systems Develop Their Own Risk Models for Patients?: Risk Prediction via Ensemble Learning and Other Data-Adaptive Methods
Kennedy EH and Wiitala WL, Ann Arbor VA HSR&D Center for Clinical Management Research; Hayward RA and Sussman JB, Ann Arbor VA HSR&D Center for Clinical Management Research and University of Michigan
Risk prediction can play an essential role in personalized medicine, comparative effectiveness studies, and cost-effective decision-making. However, it is often implemented using externally developed models (e.g., the Framingham risk score) or rigid methodology (e.g., logistic regression assuming linearity and testing few, if any, interactions). Advances in information technology and the electronic health record (EHR) may allow for radical improvements over such approaches. In this work we assess the performance of risk prediction approaches that use: (1) internally developed models; (2) flexible data-adaptive methodology; and (3) larger quantities of possibly less reliable data, compared to traditional approaches.
With administrative data from the Veterans Health Administration (VHA), we predict five-year mortality from cerebrovascular or cardiovascular disease among 9,854 male Veterans treated across twelve VHA facilities, using six different methods: the Framingham risk score, logistic regression, and four data-adaptive methods (multivariate adaptive regression splines, generalized additive models, random forests, and boosting). We explore performance across subsets of data and use split-sample cross-validation to account for overfitting. We assess discrimination and calibration with the area under the ROC curve (AUC) and the Hosmer-Lemeshow goodness-of-fit test, respectively.
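The comparison described above can be sketched in code. The following is an illustrative example only, not the authors' analysis: the VHA data are not public, so it uses synthetic covariates and outcomes, and it compares just two of the six methods (logistic regression and random forests) using split-sample validation and AUC, via scikit-learn.

```python
# Illustrative sketch (synthetic data stand-in for EHR covariates and
# five-year mortality labels; not the authors' code or data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Simulated cohort: 5,000 patients, 20 covariates, binary outcome
X, y = make_classification(n_samples=5000, n_features=20,
                           n_informative=8, random_state=0)

# Split-sample validation: fit on one half, assess discrimination
# on the held-out half to account for overfitting
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                          random_state=0)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("random forest", RandomForestClassifier(random_state=0))]:
    model.fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]  # predicted risk
    print(f"{name}: held-out AUC = {roc_auc_score(y_te, p):.3f}")
```

In a full analysis, each candidate model would be compared on the same held-out sample, with calibration checked separately (e.g., by grouping patients into deciles of predicted risk and comparing predicted to observed event rates, as the Hosmer-Lemeshow test does).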
Cross-validated AUC increased by up to 5.7 percentage points (from 68.4% to 74.1%) when using internally developed risk models instead of the Framingham risk score, despite using the same covariates. Including lab and medication data resulted in an even larger increase in AUC (8.7 percentage points). Data-adaptive methods gave increases in AUC of no more than 1.4 percentage points over standard logistic regression; however, the improvements appeared to be amplified with the availability of more data. Calibration was problematic for only one method (random forests) and one subset of covariates.
Risk models that are developed internally or "recalibrated" can substantially improve predictive performance. Even greater performance can be attained by including more risk-predictive covariates, even when using imperfect EHR data. Flexible data-adaptive methods yielded only modest improvements in our data, but such improvements may become more apparent with larger datasets.
Despite imperfect data quality, healthcare systems could achieve better risk prediction by using internally-developed models with EHR data and appropriate statistical methodology.