2011 HSR&D National Meeting Abstract
1043 — Extraction and Validation of MRSA Data from the Nation’s VA Medical Centers
Jones MM (VA Salt Lake City Health Care System), Duvall SL (VA Salt Lake City Health Care System), Spuhl J (VA Salt Lake City Health Care System), Samore MH (VA Salt Lake City Health Care System), Nielson C (VA Reno Medical Center), Rubin MA (VA Salt Lake City Health Care System)
Microbiology data are vital to epidemiologic studies of nosocomial infectious diseases. Obtaining data from multiple hospitals is a priority because infection control measures are ecological interventions, which makes the hospital the unit of analysis. We have developed a process to extract microbiology data from all VA hospitals.
VistA Remote Procedure Calls (RPCs) were used to build a database of semi-structured but free-text microbiology reports from all patients. Using natural language processing methods, we extracted mentions of Staphylococcus aureus and methicillin resistance from the reports. Completeness was estimated by comparing document identifiers with available electronic data from 11 VA hospitals. Concordance with the text of the microbiology report, as viewed through VistaWeb, was measured by string comparisons of 142 records from across the country. We validated our extraction against independently derived, MUMPS-extracted electronic microbiology data from 10 hospitals between 1999 and 2006, and against manual annotation of 3,092 randomly sampled records from throughout the VA. The set for manual annotation was enriched by requiring that each report mention at least one Staphylococcus-related token. Discordant records were manually re-reviewed and reassigned as appropriate. Sensitivity, specificity, and positive predictive value were reported for organism and susceptibility extractions.
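The abstract does not describe the natural language processing rules themselves. As a rough illustration of the kind of extraction described above, the following minimal sketch flags Staphylococcus aureus and methicillin resistance in a free-text report using regular expressions. The patterns, the function name `extract`, and the sample report layout are all assumptions made for illustration; they are not the authors' method, and real VistA report formats vary by site. Negation (for example, "no MRSA isolated") is deliberately not handled here.

```python
import re

# Hypothetical patterns; actual report wording differs between laboratories.
ORGANISM = re.compile(
    r"\b(?:staphylococcus|staph\.?|s\.)\s+aureus\b", re.IGNORECASE
)
MRSA_TERM = re.compile(
    r"\b(?:mrsa|methicillin[-\s]resistant)\b", re.IGNORECASE
)
# Oxacillin is the usual laboratory proxy for methicillin. This pattern
# allows an optional MIC value (e.g. ">=4") between the drug name and the
# interpretation, and only looks along the same line.
OXACILLIN_R = re.compile(
    r"\b(?:oxacillin|methicillin)\b[ \t:.<>=\d\-]*(?:r|resistant)\b",
    re.IGNORECASE,
)


def extract(report: str) -> dict:
    """Return organism and resistance flags for one free-text report."""
    mrsa_named = bool(MRSA_TERM.search(report))
    s_aureus = mrsa_named or bool(ORGANISM.search(report))
    resistant = mrsa_named or (s_aureus and bool(OXACILLIN_R.search(report)))
    return {"s_aureus": s_aureus, "methicillin_resistant": resistant}


# Invented sample reports, not drawn from any real record.
resistant_report = (
    "CULTURE RESULTS: STAPHYLOCOCCUS AUREUS\n"
    "  OXACILLIN   >=4  R\n"
    "  VANCOMYCIN  1    S"
)
susceptible_report = (
    "CULTURE RESULTS: STAPHYLOCOCCUS AUREUS\n"
    "  OXACILLIN   0.5  S"
)
other_report = "CULTURE RESULTS: ESCHERICHIA COLI\n  AMPICILLIN  R"

for text in (resistant_report, susceptible_report, other_report):
    print(extract(text))
```

A rule set like this is easy to audit against manually annotated records, which is one reason a validation step of the kind described above matters: each discordant record points to a specific pattern that is too loose or too strict.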
We estimate that our extracted data are 98.5% complete by document identifiers. The RPC-retrieved reports substantively matched 100% of the reports seen on VistaWeb. Staphylococcus aureus was extracted with 98.9% sensitivity, 99.7% specificity, and 99.6% positive predictive value when compared against the manually annotated data set. Methicillin resistance was extracted with 99.2% sensitivity, 99.4% specificity, and 97.9% positive predictive value when compared against the same set. All accuracy statistics were higher when the extraction was compared against the electronic data set.
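For readers less familiar with the three accuracy statistics reported above, the sketch below shows how they follow from paired labels: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and positive predictive value is TP / (TP + FP). The class `Confusion`, the function `tally`, and the toy labels are hypothetical and chosen only for illustration; they do not reproduce the study's counts.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Confusion:
    """Counts of agreement between extracted and reference labels."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0

    @property
    def sensitivity(self) -> float:
        # Share of reference positives that the extraction found.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of reference negatives that the extraction left unflagged.
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        # Share of extracted positives that the reference confirms.
        return self.tp / (self.tp + self.fp)


def tally(extracted: Iterable[bool], reference: Iterable[bool]) -> Confusion:
    """Compare extracted labels with reference (e.g. annotated) labels."""
    counts = Confusion()
    for pred, truth in zip(extracted, reference):
        if pred and truth:
            counts.tp += 1
        elif pred and not truth:
            counts.fp += 1
        elif truth:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


# Toy labels for eight records; illustrative only.
extracted = [True, True, True, False, False, False, True, False]
reference = [True, True, True, True, False, False, False, False]

counts = tally(extracted, reference)
print(counts)
print(counts.sensitivity, counts.specificity, counts.ppv)
```

Because the annotated sample was enriched for Staphylococcus-related reports, the positive predictive value in particular depends on that sample's prevalence and would differ in an unenriched sample.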
Applying informatics tools to assemble and extract microbiology data can yield data useful for surveillance and epidemiologic study. The RPC-extracted data appear complete and concordant with the source reports, and organisms and susceptibilities are extracted with a high degree of accuracy.
We have described a system to securely gather microbiology data from the entire VA network of hospitals. These data make previously infeasible studies possible, including investigations of better ways to deliver care without nosocomial infection.