1007 — Evaluation of Statistical Methods for Profiling Hospital Outcome Performance: The Case of Acute Myocardial Infarction 30-day Mortality in the VA
Hanchate AD, Stolzmann K, Burgess JF, and Pekoz E, COLMR, VA Boston; Christiansen CL, Boston University; Shokeen P and Shwartz M, COLMR, VA Boston
Despite the increased use of hospital profiling, there is concern about the appropriateness of the statistical methods used. We examine two commonly used methods: traditional logistic regression, which yields observed-to-expected ratios, and the random effects (RE) hierarchical logistic regression method used in the VA/CMS HospitalCompare program. The two methods produce different profiles of VA hospitals for 30-day mortality among acute myocardial infarction (AMI) patients. There is no gold standard for identifying which profiling method "gets it right". To evaluate the two methods, we used simulated data for which the true hospital performance was predetermined. We assess the implications of the findings for VA hospital profiling.
The simulated data consist of randomly generated measures of patient risk, hospital performance, and dichotomous patient outcomes. We developed a series of simulated scenarios by incrementally varying hospital volumes and inter-hospital variation in performance (intraclass correlation [ICC]). For each simulation, we obtained standardized hospital risk-adjusted mortality rates (SMRs) using the traditional and RE methods and classified hospitals into high, average, and low SMR categories. These classifications were compared with the true SMR categories to calculate sensitivity, specificity, and positive predictive values (PPV+/-).
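The data-generating step described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' simulation code: the function name `simulate_oe_ratios`, the baseline log-odds `base_logit`, the standard-normal patient risk score, and the latent-variable definition of ICC for a logistic model are all assumptions introduced here. For brevity, expected deaths are taken from the true patient risk model rather than from a fitted logistic regression, and the RE hierarchical fit is not shown.

```python
import numpy as np

def simulate_oe_ratios(n_hospitals=100, volume=200, icc=0.02,
                       base_logit=-2.0, seed=0):
    """Simulate one scenario and return (true hospital effects, O/E ratios).

    All parameter names and default values are hypothetical.
    """
    rng = np.random.default_rng(seed)

    # Assumption: ICC is defined on the latent logistic scale,
    # icc = sigma2 / (sigma2 + pi^2 / 3), solved here for sigma2.
    sigma2 = icc * (np.pi ** 2 / 3) / (1.0 - icc)

    # True (predetermined) hospital performance: one random effect each.
    hospital_effect = rng.normal(0.0, np.sqrt(sigma2), size=n_hospitals)

    # Patient risk: a single standard-normal risk score per patient.
    risk = rng.normal(0.0, 1.0, size=(n_hospitals, volume))

    # Dichotomous outcome drawn from the true death probability.
    logit_true = base_logit + risk + hospital_effect[:, None]
    p_true = 1.0 / (1.0 + np.exp(-logit_true))
    died = rng.random((n_hospitals, volume)) < p_true

    # Expected deaths if every hospital performed at the average
    # (hospital effect = 0), using the true patient risk model.
    p_expected = 1.0 / (1.0 + np.exp(-(base_logit + risk)))

    observed = died.sum(axis=1)
    expected = p_expected.sum(axis=1)
    return hospital_effect, observed / expected

if __name__ == "__main__":
    effects, oe = simulate_oe_ratios(icc=0.02, volume=200)
    print("O/E ratio range:", oe.min(), oe.max())
```

Varying `icc` and `volume` across calls mirrors the incremental scenarios described above; hospitals could then be classified by comparing each O/E ratio against chosen cutoffs.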
Sensitivity was higher using the traditional method, while PPV+ and specificity were higher using the RE method. At high ICC, both methods exhibited high and similar sensitivity and PPV+. As ICC decreased, the methods diverged, with the traditional method showing higher sensitivity but lower PPV+ than the RE method. At ICC = 2%, the rate estimated for VA AMI data, the highest sensitivities attained were 33% (RE) and 66% (traditional), and the highest PPV+ values were 72% (RE) and 43% (traditional). Decreasing hospital volumes widened this divergence between the methods.
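To make the tradeoff between sensitivity and PPV+ concrete, the sketch below computes the three measures for flagging "high SMR" hospitals from a true classification and a method's classification. The function name `flag_metrics` and the ten-hospital example are hypothetical and are not taken from the study data.

```python
import numpy as np

def flag_metrics(truly_high, flagged_high):
    """Sensitivity, specificity, and PPV for a 'high SMR' flag."""
    truly_high = np.asarray(truly_high, dtype=bool)
    flagged_high = np.asarray(flagged_high, dtype=bool)

    tp = np.sum(truly_high & flagged_high)    # correctly flagged
    fp = np.sum(~truly_high & flagged_high)   # flagged in error
    fn = np.sum(truly_high & ~flagged_high)   # missed
    tn = np.sum(~truly_high & ~flagged_high)  # correctly not flagged

    def ratio(num, den):
        # Undefined when the denominator is zero (e.g. nothing flagged).
        return float(num) / float(den) if den > 0 else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }

if __name__ == "__main__":
    # Hypothetical example: 3 of 10 hospitals are truly high-SMR.
    truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # flags few hospitals
    liberal = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]   # flags more hospitals
    print(flag_metrics(truth, cautious))
    print(flag_metrics(truth, liberal))
```

In this toy example, the cautious flagging rule finds one of three true outliers (sensitivity 1/3) with no false alarms (PPV 1), while the liberal rule finds two of three (sensitivity 2/3) but half of its flags are wrong (PPV 1/2). This is the same direction of tradeoff as that reported between the RE and traditional methods.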
At higher ICC and high hospital volumes, both methods exhibit high and similar performance in identifying hospitals with high or low SMRs. At lower ICC and lower hospital volumes, the traditional method exhibits higher sensitivity and the RE method higher PPV+.
To improve the quality of hospital care, we need valid methods for identifying high and low performers. Our work provides a basis for selecting the best method for identifying these performers after considering the tradeoffs between sensitivity and specificity.