Valid and reliable measures of health care quality are critical to the success of performance-based payment systems. Providers are typically evaluated on the basis of summary measures of quality, which aggregate information from multiple individual quality measures. Existing summary measures differ along a number of important dimensions, however, including the criteria used to weight individual quality measures, the degree to which performance is adjusted for patients' severity of illness, and whether patients' preferences regarding the outcomes of poor quality are incorporated. Despite these differences, no studies have systematically evaluated the strengths and limitations of alternative summary measures or compared the consistency of their quality ratings.
Our project has three main objectives. First, we describe the statistical methods used to derive four summary measures of quality and critique each method's strengths and limitations. Second, we use the four approaches to estimate the quality of diabetes care provided by hospitals and clinics in VISN 11 and compare the resulting quality ratings across methods. Third, we evaluate the statistical properties of each summary measure, including case mix bias and reliability.
We derive summary estimates of the quality of diabetes care using four distinct approaches to aggregating five dichotomous quality measures. First, we estimate a raw mean score for each provider, which represents the proportion of successes across the five measures averaged over all patients. Second, we estimate an "all-or-nothing" mean score, defined as the proportion of patients for whom all five measures were scored as a success. Third, we fit an item-response theory model to derive a latent quality trait for each provider, which serves as a summary index of quality. The fourth method quantifies the effect of a failure on each measure on long-term health outcomes and aggregates these effects, using an existing Markov model of diabetes progression to simulate the quality-adjusted life expectancy lost to quality failures. After fitting the four models, we assess differences in which providers are classified as outliers and in which providers fall in the upper and lower quartiles of the provider distribution. We examine case mix bias by risk-adjusting the individual measures that are sensitive to case mix and comparing the adjusted results with the unadjusted results. Finally, we compare the reliability of each summary measure using a combination of standard formulae and variance estimates for the between-provider distribution.
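The first two summary measures can be illustrated with a minimal sketch. It assumes each patient record is a provider identifier paired with five 0/1 outcomes; the function name `summary_scores`, the record layout, and the example data are hypothetical and are not drawn from the study. The example shows how the two scores can rank the same providers differently.

```python
from collections import defaultdict


def summary_scores(records):
    """Compute raw mean and all-or-nothing scores per provider.

    records: iterable of (provider_id, outcomes) pairs, one per patient,
    where outcomes is a list of 0/1 values (1 = success on that measure).
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(
        lambda: {"patients": 0, "successes": 0, "measures": 0, "all_met": 0}
    )
    for provider, outcomes in records:
        t = totals[provider]
        t["patients"] += 1
        t["successes"] += sum(outcomes)
        t["measures"] += len(outcomes)
        # A patient counts toward all-or-nothing only if every measure succeeded.
        t["all_met"] += int(all(outcomes))
    return {
        provider: {
            # With the same number of measures per patient, the pooled
            # proportion equals the mean of per-patient proportions.
            "raw_mean": t["successes"] / t["measures"],
            "all_or_nothing": t["all_met"] / t["patients"],
        }
        for provider, t in totals.items()
    }


# Hypothetical data: provider A succeeds fully or not at all,
# provider B misses exactly one measure for every patient.
records = [
    ("A", [1, 1, 1, 1, 1]),
    ("A", [1, 1, 1, 1, 1]),
    ("A", [0, 0, 0, 0, 0]),
    ("A", [0, 0, 0, 0, 0]),
    ("B", [1, 1, 1, 1, 0]),
    ("B", [1, 1, 1, 0, 1]),
    ("B", [1, 1, 0, 1, 1]),
    ("B", [1, 0, 1, 1, 1]),
]
scores = summary_scores(records)
for provider in sorted(scores):
    print(provider, scores[provider])
```

In this made-up example provider B has the higher raw mean score while provider A has the higher all-or-nothing score, which is the kind of inconsistency in ratings across methods that the comparison is designed to detect.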
No results at this time.
VHA has adopted pay-for-performance as a core strategy to improve the quality of care within the VA system. Veterans' health care is likely to benefit from advances in quality measurement, particularly if summary measures of quality (which form the basis of incentive payments) are constructed in a way that prioritizes the individual measures most highly correlated with better long-term health outcomes. Providers would then be more likely to focus quality improvement efforts in these areas. We expect VHA's Office of Quality and Performance to be interested in the results of our critique, and the Office might use them to inform its current quality measurement initiatives.
None at this time.
Quality assessment, Quality Measure, Research measure, Research method