IIR 16-253 – HSR&D Study
Semi-parametric Statistical Methods for Predicting High-cost VA Patients Using High-Dimensional Covariates
Steven B. Zeliadt PhD MPH
VA Puget Sound Health Care System Seattle Division, Seattle, WA
May 2018 -
Rising demand and health care costs make it urgent to develop new statistical methods that accurately predict high-cost VA patients and identify the important risk factors associated with high costs. The ability to prospectively predict high-cost patients is an important step toward controlling future health care costs. It is also important to identify the disease areas that contribute significantly to high health care costs, along with other risk factors that policymakers can target with future interventions. Health care cost data are characterized by a high level of skewness and heteroscedastic variances. The large number of variables collected in the VA databases provides rich information but poses great challenges for statistical analysis and computation. In addition, the administrative and electronic medical record data from VA databases often contain missing values. The new statistical procedure we propose aims to take advantage of the rich VA databases for analyzing cost data. It develops and employs state-of-the-art high-dimensional semiparametric statistical procedures to handle the complexity of VA data sets.
The objectives are to:
1. Predict high-cost patients by developing novel high-dimensional semiparametric prediction procedures.
2. Identify important risk factors for high costs by developing novel high-dimensional sparse semiparametric variable selection procedures and a new, efficient algorithm.
3. Integrate the proposed High Costs Patient (HCP) system with the existing Care Assessment Needs (CAN) system via collaboration with the Office of Analytics and Business Intelligence (OABI) and analyze cost data for patients receiving primary care within VHA.
To achieve the objectives, our research team will first develop the new semiparametric methods. We will implement an approach that incorporates high-dimensional covariates and nonlinear covariate effects and that addresses the challenge of censoring by death, in order to improve accuracy and increase modeling flexibility. We will also design new weighted semiparametric quantile-regression-based variable selection procedures that can simultaneously identify significant risk factors and estimate their effects for high-dimensional data in the presence of missing values.
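As a rough illustration of the kind of objective such a procedure might minimize, the sketch below computes a weighted check (pinball) loss with an L1 penalty on the coefficients. It is not the study's method: the function names, the toy data, the linear form of the fitted value, and the interpretation of the weights (for example, inverse probabilities of a record being fully observed) are all assumptions made for illustration.

```python
# Illustrative sketch only. All names and the linear model form are
# hypothetical; they are not taken from the study.

def pinball_loss(residual, tau):
    """Check (pinball) loss for a quantile level tau in (0, 1)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def weighted_objective(intercept, coefs, rows, costs, weights, tau, lam):
    """Weighted pinball loss plus an L1 penalty on the coefficients.

    The L1 penalty is what pushes some coefficients to exactly zero,
    so that selection and estimation happen in one step.
    """
    loss = 0.0
    for x, y, w in zip(rows, costs, weights):
        fitted = intercept + sum(b * xj for b, xj in zip(coefs, x))
        loss += w * pinball_loss(y - fitted, tau)
    return loss + lam * sum(abs(b) for b in coefs)


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau.

    This value minimizes the weighted pinball loss when the model
    contains only an intercept.
    """
    total = sum(weights)
    running = 0.0
    for v, w in sorted(zip(values, weights)):
        running += w
        if running >= tau * total:
            return v
    return max(values)


# Toy, made-up costs with one extreme value to mimic skewness.
costs = [1.0, 2.0, 3.0, 4.0, 100.0]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
median = weighted_quantile(costs, weights, 0.5)
```

With equal weights the weighted median of the toy costs is unaffected by how extreme the largest value is, which is one reason quantile-based losses are attractive for heavily skewed cost data. Choosing a high quantile level (say `tau = 0.9`) would target the upper tail of the cost distribution instead.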
To test and implement these methods, we will develop a patient-level dataset that combines all available cost data from the databases provided through the Decision Support System (DSS) National Extracts. We will link data from the Managerial Cost Accounting System (MCA, formerly the Decision Support System or DSS) with three VA databases: the VA Patient Treatment File (PTF), the VA Outpatient Clinic File (OCF), and the VA Beneficiary Identification and Records Locator Subsystem death file. We will compare the newly proposed methods with existing methods using both the VA data and simulated data.
Our proposed High Costs Prediction (HCP) system will improve care allocation by identifying patients who are at high risk of incurring high costs within the subsequent one-year period. Targeting care to these patients can reduce avoidable use of health care services and help reduce costs. The HCP system will also allow us to identify the disease areas that contribute significantly to high health care costs, which policymakers can target with future interventions.
None at this time.
Treatment - Observational, Treatment - Comparative Effectiveness, TRL - Applied/Translational
Cost-Effectiveness, Organizational Planning, Statistical Methods