16. Comparing the Performance of Two Diagnosis-Based Case-Mix Measures Among Groups of Veterans Using Health Care Services
AK Rosen, Center for Health Quality, Outcomes, and Economic Research; S Loveland, Center for Health Quality, Outcomes, and Economic Research; JJ Anderson, Center for Health Quality, Outcomes, and Economic Research; C Rakovski, Center for Health Quality, Outcomes, and Economic Research; D Berlowitz, Center for Health Quality, Outcomes, and Economic Research
Objectives: Diagnosis-based case-mix measures are commonly used to predict resource utilization in an entire population. Little work, however, has explored how well these measures perform in population subgroups that may consume large amounts of resources or require high-cost care. This paper compares the ability of two leading case-mix measures, Adjusted Clinical Groups (ACGs) and Diagnostic Cost Groups (DCGs), to explain concurrent utilization among veterans with low, medium, and high utilization.
Methods: A 40% random sample of all veterans who use health care services, excluding those with only telephone or dental encounters, was obtained from VA inpatient and outpatient databases. The resulting sample consisted of 1,046,803 veterans who received acute, long-term, or outpatient care during FY'97. An additional 20% random sample of veterans was selected for validation (n=524,461). We ran two weighted least squares regression models to explain concurrent FY'97 days of total utilization (inpatient + outpatient) using FY'97 diagnoses and demographics: 1) a reparameterized ACG model containing 32 ADGs, age, and gender, and 2) a reparameterized DCG model containing 136 Hierarchical Condition Categories (HCCs), age, and gender. We obtained R-squares, validated R-squares, cross-validated R-squares, and estimates of utilization (E) within categories of actual utilization (O) representing low (1-9 days), medium (10-29 days), and high utilization (30-365 days). Predictive ratios (PRs) were calculated within each subgroup to compare expected with actual utilization. The PR for a group is its predicted utilization divided by its actual utilization (E/O); a PR above 1 indicates overestimation and a PR below 1 indicates underestimation.
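The predictive ratio calculation described above can be illustrated with a minimal Python sketch. The band cutoffs come from the Methods; the function names (`band_of`, `predictive_ratios`) and the observed and expected values are hypothetical and chosen only to show the arithmetic. They are not the study's code or data.

```python
from collections import defaultdict

# Utilization bands from the Methods: low 1-9, medium 10-29, high 30-365 days.
BANDS = [("low", 1, 9), ("medium", 10, 29), ("high", 30, 365)]


def band_of(observed_days):
    """Return the band name for an observed day count, or None if outside all bands."""
    for name, lower, upper in BANDS:
        if lower <= observed_days <= upper:
            return name
    return None


def predictive_ratios(observed, expected):
    """Compute PR = sum(expected) / sum(observed) within each band of observed utilization."""
    sum_observed = defaultdict(float)
    sum_expected = defaultdict(float)
    for o, e in zip(observed, expected):
        name = band_of(o)
        if name is None:
            continue  # skip records that fall outside the defined bands
        sum_observed[name] += o
        sum_expected[name] += e
    return {name: sum_expected[name] / sum_observed[name] for name in sum_observed}


# Made-up values for illustration only (not study data).
observed = [2, 4, 12, 20, 60, 100]   # actual days of utilization (O)
expected = [6, 6, 14, 18, 30, 50]    # model-predicted days (E)

prs = predictive_ratios(observed, expected)
for name, ratio in prs.items():
    print(f"{name}: PR = {ratio:.2f}")
```

In this toy example the low band is overestimated (PR of 2.0), the medium band is predicted exactly (PR of 1.0), and the high band is underestimated (PR of 0.5), which mirrors the direction of the pattern reported in the Results.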
Results: The DCG/HCC model had better explanatory power than the ACG model in both the development and validation samples. Development-sample R-squares were 0.315 for DCGs and 0.231 for ACGs. Validated R-squares, almost identical to development R-squares, indicate the models' stability. Cross-validated R-squares were 0.314 for DCGs and 0.231 for ACGs. ACGs predicted utilization more accurately than DCGs within the subgroup containing veterans with the lowest utilization (15.5%); PRs were 1.5 for ACGs and 2.8 for DCGs. DCGs had greater predictive accuracy than ACGs within all other subgroups, although both models overestimated the utilization of veterans with low utilization and underestimated the utilization of veterans with high utilization. PRs obtained from DCGs ranged from 2.8 for veterans with the lowest utilization to 0.23 for veterans with the highest utilization.
Conclusions: Although the reparameterized models did fairly well in explaining concurrent utilization, and performed similarly in the development and validation samples, neither model was accurate in explaining utilization among subgroups of the population. DCGs tended to predict more accurately than ACGs, but even this model underestimated high utilization in particular. Inadequate prediction for veterans with high utilization is especially problematic because it suggests that this group requires resources that would not be provided under a model-based resource allocation.
Impact: The VA is considering adapting risk adjustment methodologies for several applications, including provider profiling and resource allocation. These results suggest that existing methodologies may need to be modified for the VA to allocate resources equitably across subgroups.