Approximately 375,000 surgical procedures are performed in the VHA each year, most of them monitored by the VA Surgical Quality Improvement Program (VASQIP, formerly NSQIP). A study using VASQIP data from FY-04 found mortality of 3.1% at 30 days, 10.7% at one year, and 43% at five years. While VASQIP was initially successful in reducing peri-operative morbidity and mortality, further improvement has slowed. To continue improving, new research must be pursued in two directions: exploiting new sources of data, and applying improved analytic methods to identify subgroups at risk for adverse outcomes.
The objectives of this study are to a) identify novel predictors of adverse outcomes, with a focus on physiologic dynamics; b) implement and evaluate Atul Gawande's "Surgical Apgar" risk score using VASQIP data; c) compare the predictive capabilities of models incorporating the new measures and the Surgical Apgar to the currently used VASQIP models; d) explore additional modeling and classification methods; and e) assess whether these additional methods and measures significantly improve predictive performance.
VASQIP has been collecting comprehensive pre-operative and outcomes data on a substantial sample of surgeries in VA facilities nationwide since 1994. An ongoing study (VA HSR&D IIR 05-229, PI Dr. John Bian, formerly Dr. Terri Monk) collected and standardized anesthesia information management system (AIMS) data from multiple VA facilities and merged it with VASQIP data for approximately 30,000 surgeries. Novel summary measures of the AIMS data, focusing on physiologic dynamics, were identified and incorporated into standard predictive models. Additional prediction methods have been applied, including classification and regression trees (CART), hybrid CART/logistic regression, random forests, support vector machines, and boosting. Last, we have calculated the surgical Apgar score (SAS) for our analytical sample using intraoperative heart rate and blood pressure data and a published algorithm that estimates blood loss from VASQIP data and hematocrit values. Using 1000 iterations of split-sample validation, we have compared the performance of the SAS to VASQIP-based models and to VASQIP models augmented with the individual components of the SAS, using c-statistics both overall and stratified by surgical specialty. Learning curve analyses have compared the performance of logistic regression models to CART models and random forests and have not found a substantial improvement with the machine learning methods in this data set.
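As an illustration of the score itself, the following is a minimal sketch of the SAS point assignment, assuming the cut-points from the originally published 10-point score (estimated blood loss, lowest mean arterial pressure, lowest heart rate). The function and argument names are hypothetical and are not taken from the study's code; the published score's special handling of pathologic bradyarrhythmias and the blood-loss estimation algorithm used in this study are not reproduced.

```python
def surgical_apgar_score(ebl_ml, lowest_map_mmhg, lowest_hr_bpm):
    """Return a 0-10 Surgical Apgar Score (higher = lower risk).

    Cut-points are assumed from the originally published score.
    Inputs are hypothetical names: estimated blood loss (mL),
    lowest intraoperative mean arterial pressure (mm Hg), and
    lowest intraoperative heart rate (beats/min).
    """
    # Estimated blood loss: 0-3 points
    if ebl_ml > 1000:
        ebl_pts = 0
    elif ebl_ml > 600:
        ebl_pts = 1
    elif ebl_ml > 100:
        ebl_pts = 2
    else:
        ebl_pts = 3

    # Lowest mean arterial pressure: 0-3 points
    if lowest_map_mmhg < 40:
        map_pts = 0
    elif lowest_map_mmhg < 55:
        map_pts = 1
    elif lowest_map_mmhg < 70:
        map_pts = 2
    else:
        map_pts = 3

    # Lowest heart rate: 0-4 points
    # (bradyarrhythmia rule of the published score is omitted here)
    if lowest_hr_bpm > 85:
        hr_pts = 0
    elif lowest_hr_bpm > 75:
        hr_pts = 1
    elif lowest_hr_bpm > 65:
        hr_pts = 2
    elif lowest_hr_bpm > 55:
        hr_pts = 3
    else:
        hr_pts = 4

    return ebl_pts + map_pts + hr_pts


# Example: moderate blood loss, mildly low pressure, normal heart rate
example_score = surgical_apgar_score(ebl_ml=300, lowest_map_mmhg=60, lowest_hr_bpm=70)
```

Because a higher SAS indicates lower risk, the score would need to be reversed (for example, 10 minus the score) before being used as a risk predictor in a discrimination analysis.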
The performance of the surgical Apgar and the augmented Apgar methods varies widely by specialty, with the best performance for GI surgery. Overall and in stratified analyses, the VASQIP-based and augmented VASQIP-based models performed significantly better than the surgical Apgar. Learning curve analyses comparing logistic regression methods to CART and hybrid CART/regression methods suggest that CART and random forests do not offer substantial improvement over standard methods for predicting adverse outcomes with these data.
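The c-statistic comparison under repeated split-sample validation can be sketched as follows. This is an illustrative outline only, not the study's analysis code: the function names, the 50% hold-out fraction, and the seed are assumptions, and the sketch evaluates a fixed risk score on each held-out half rather than refitting a model on the training half.

```python
import random


def c_statistic(risk, outcomes):
    """Concordance (area under the ROC curve) for a binary outcome.

    Fraction of (event, non-event) pairs in which the event has the
    higher predicted risk; ties count as one half.
    """
    events = [r for r, y in zip(risk, outcomes) if y == 1]
    non_events = [r for r, y in zip(risk, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def split_sample_c(risk, outcomes, n_iter=1000, test_frac=0.5, seed=0):
    """Mean c-statistic over repeated random hold-out samples.

    n_iter, test_frac, and seed are illustrative defaults (assumptions).
    Splits whose hold-out half contains only one outcome class are skipped.
    """
    rng = random.Random(seed)
    idx = list(range(len(risk)))
    n_test = int(len(idx) * test_frac)
    stats = []
    for _ in range(n_iter):
        rng.shuffle(idx)
        test = idx[:n_test]
        ys = [outcomes[i] for i in test]
        if 0 < sum(ys) < len(ys):
            stats.append(c_statistic([risk[i] for i in test], ys))
    if not stats:
        raise ValueError("no valid hold-out samples")
    return sum(stats) / len(stats)
```

Running the same procedure within each surgical specialty would give the stratified comparison; a model-based predictor would additionally be refit on the non-held-out half in each iteration.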
Improved risk modeling and prediction can ultimately result in more accurate risk stratification and further reduce the rates of post-operative morbidity and mortality within the VA, thereby positively impacting Veterans' health. Findings from the surgical Apgar analysis suggest that there are better tools available in the VA for risk stratification and that simple scores 'tuned' for each specialty may perform better than an overall surgical Apgar. Analyses with machine-learning algorithms show little advantage over logistic regression models for the data in this study. Such methods may show an advantage with VA-wide intra-operative data.
None at this time.
Technology Development and Assessment, Treatment - Comparative Effectiveness