VA Health Systems Research

IIR 16-253 – HSR Study

Semi-parametric Statistical Methods for Predicting High-cost VA Patients Using High-Dimensional Covariates
Steven B. Zeliadt, PhD MPH
VA Puget Sound Health Care System Seattle Division, Seattle, WA
Funding Period: May 2018 - April 2022
Rising demand and health care costs make it urgent to develop new statistical methods that accurately predict high-cost VA patients and identify the risk factors associated with high costs. The ability to prospectively predict high-cost patients is an important step toward controlling future health care costs. It is also important to identify the disease areas and other risk factors that contribute significantly to high health care costs, which policymakers can target with future interventions. Health care cost data are characterized by a high level of skewness and heteroscedastic variances. The large number of variables collected in VA databases provides rich information but poses great challenges for statistical analysis and computation. In addition, administrative and electronic medical record data from VA databases often contain missing values. The statistical procedures we propose aim to take advantage of the rich VA databases for analyzing cost data. We will develop and employ state-of-the-art high-dimensional semiparametric statistical procedures to handle the complexity of VA data sets.

The objectives are to:
1. Predict high-cost patients by developing novel high-dimensional semiparametric prediction procedures.
2. Identify important risk factors for high costs by developing novel high-dimensional sparse semiparametric variable selection procedures and new efficient algorithms.
3. Integrate the proposed High Costs Patient (HCP) system with the existing Care Assessment Needs (CAN) system in collaboration with the Office of Analytics and Business Intelligence (OABI), and analyze cost data for patients receiving primary care within VHA.

To achieve the objectives, our research team will first develop the new semiparametric methods. We will implement an approach that incorporates high-dimensional covariates and nonlinear covariate effects and addresses the challenge of censoring by death, in order to improve accuracy and increase modeling flexibility. We will also design new weighted semiparametric quantile-regression-based variable selection procedures that can simultaneously identify significant risk factors and estimate their effects for high-dimensional data in the presence of missing values.
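As a rough illustration of the building blocks named above, the sketch below shows the quantile (check) loss, a weighted version of it, and an L1 penalty that encourages sparse coefficients. It is not the study's procedure: the function names are hypothetical, the use of observation weights to account for missing data (for example, inverse-probability weights) is an assumption, and the linear form of the fitted value stands in for the more flexible semiparametric model the abstract describes.

```python
def check_loss(residual, tau):
    """Quantile (check) loss: tau * r when r >= 0, (tau - 1) * r when r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def penalized_objective(X, y, beta, tau, lam, weights=None):
    """Weighted mean check loss plus an L1 penalty on the slopes.

    beta[0] is an unpenalized intercept; beta[1:] are slopes.
    `weights` is a hypothetical per-patient weight (assumed here to
    stand in for a missing-data adjustment); defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(y)
    loss = 0.0
    for xi, yi, wi in zip(X, y, weights):
        fitted = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
        loss += wi * check_loss(yi - fitted, tau)
    penalty = lam * sum(abs(b) for b in beta[1:])
    return loss / sum(weights) + penalty


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau.

    This value minimizes the weighted check loss over constants.
    """
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]


# Synthetic, skewed "cost" values (illustrative only, not VA data).
costs = [100.0, 200.0, 300.0, 400.0, 50000.0]
median_cost = weighted_quantile(costs, [1.0] * 5, 0.5)
```

Because the check loss grows linearly rather than quadratically in the residual, a single extreme cost moves the fitted quantile far less than it moves a mean, which is one reason quantile-based methods are a natural fit for skewed cost data.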
To test and implement these methods, we will develop a patient-level dataset that combines all available cost data from the Decision Support System (DSS) National Extracts. We will link data from the Managerial Cost Accounting (MCA) System, formerly known as DSS, with three VA databases: the VA Patient Treatment File (PTF), the VA Outpatient Clinic File (OCF), and the VA Beneficiary Identification and Records Locator Subsystem death file. We will compare the newly proposed methods with existing methods using both VA data and simulated data.
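The linkage step can be pictured with a minimal sketch that rolls record-level rows up to one row per patient. Everything about the data layout here is an assumption made for illustration: the field names (patient_id, cost, death_date), the function name, and the in-memory lists of dictionaries are hypothetical and do not reflect the actual structure of the MCA, PTF, OCF, or death files.

```python
from collections import defaultdict


def build_patient_level(cost_rows, inpatient_rows, outpatient_rows, death_rows):
    """Combine record-level rows into one summary per patient.

    All inputs are lists of dicts keyed by a hypothetical 'patient_id'.
    Patients who appear only in a utilization or death file are kept,
    with a total cost of 0.0.
    """
    patients = defaultdict(lambda: {
        "total_cost": 0.0,
        "inpatient_stays": 0,
        "outpatient_visits": 0,
        "death_date": None,
    })
    for row in cost_rows:          # cost records
        patients[row["patient_id"]]["total_cost"] += row["cost"]
    for row in inpatient_rows:     # inpatient records
        patients[row["patient_id"]]["inpatient_stays"] += 1
    for row in outpatient_rows:    # outpatient records
        patients[row["patient_id"]]["outpatient_visits"] += 1
    for row in death_rows:         # death records
        patients[row["patient_id"]]["death_date"] = row["death_date"]
    return dict(patients)


# Tiny synthetic example (not VA data).
linked = build_patient_level(
    cost_rows=[
        {"patient_id": "A", "cost": 100.0},
        {"patient_id": "A", "cost": 50.0},
        {"patient_id": "B", "cost": 10.0},
    ],
    inpatient_rows=[{"patient_id": "A"}],
    outpatient_rows=[{"patient_id": "B"}, {"patient_id": "B"}],
    death_rows=[{"patient_id": "A", "death_date": "2019-01-01"}],
)
```

Keeping the death date on the patient-level row matters for the modeling step, since a patient who dies during follow-up has costs that are censored by death rather than simply low.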


Our proposed High Costs Prediction (HCP) system will improve care allocation by identifying patients who are at high risk of incurring high costs within the subsequent one-year period. Targeting care to these patients can reduce avoidable use of health care services and thereby reduce costs. The HCP system will also allow us to identify disease areas that contribute significantly to high health care costs, which policymakers can target with future interventions.

External Links for this Project

NIH Reporter

Grant Number: I01HX002310-01A1


DRA: Health Systems
DRE: TRL - Applied/Translational, Treatment - Observational, Treatment - Comparative Effectiveness
Keywords: Cost-Effectiveness, Organizational Planning, Statistical Methods
MeSH Terms: none
