2011 National Meeting

1031 — Automated Extraction of Ejection Fraction (EF) for Heart Failure (HF) from VA Echocardiogram Reports

Garvin JH (VA Salt Lake City HCS), South BR (VA Salt Lake City HCS), Bolton D (VA Salt Lake City HCS), Shen S (VA Salt Lake City HCS), Duvall S (VA Salt Lake City HCS), Bray B (VA Salt Lake City HCS), Heidenreich P (VA Palo Alto HCS), Samore M (VA Salt Lake City HCS), Goldstein MK (VA Palo Alto HCS GRECC and COE-Palo Alto)

The aims of this research were to: build a natural language processing tool that extracts, from free-text echocardiogram reports, the concepts and associated values used for performance measurement in the clinical domain of HF; validate the accuracy of the extracted concepts and EF values against a reference standard determined through manual review of the text; and generalize the methods developed in this research to automate data capture for the HF performance measures within the VA. This project was a translational use case project (TUCP) undertaken as part of the VA Consortium for Healthcare Informatics Research (CHIR).

Information extraction followed a heuristic approach. An initial set of regular expressions and rules was developed iteratively to capture measurements of left ventricular function. A random sample of echocardiogram reports was obtained from seven VA medical centers and divided into two document sets. The first served as the training set on which the system was iteratively refined. The second was sequestered and, once the system reached a pre-specified level of accuracy on the training set, used as the test (validation) set. Two reviewers independently annotated the selected documents; where they disagreed, a third reviewer adjudicated to establish the final reference standard. This reference standard was then used to train the automated system and assess its performance.
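The rule-based approach described above can be illustrated with a minimal Python sketch. The abstract does not publish the system's actual regular expressions or rules, so the pattern, the function names `extract_ef` and `ef_below_40`, and the handling of ranges below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pattern, NOT the rules used in the study:
# a concept mention, a short gap without digits, then a value or a range.
EF_PATTERN = re.compile(
    r"\b(?:ejection\s+fraction|LVEF|EF)\b"    # concept mention
    r"[^\d%]{0,40}?"                          # short gap, e.g. " is estimated at "
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?"  # single value or low-high range
    r"\s*%",                                  # percent sign is required
    re.IGNORECASE,
)


def extract_ef(text):
    """Return a list of (low, high) EF percentages found in the text."""
    results = []
    for match in EF_PATTERN.finditer(text):
        low = int(match.group(1))
        # A single value is stored as a range whose ends are equal.
        high = int(match.group(2)) if match.group(2) else low
        results.append((low, high))
    return results


def ef_below_40(text):
    """Flag a report if any extracted EF lies entirely below 40%.

    Assumption for this sketch: a range counts only when its upper
    bound is below 40. The study's actual rule is not stated.
    """
    return any(high < 40 for _low, high in extract_ef(text))


if __name__ == "__main__":
    report = "The left ventricular ejection fraction is estimated at 30-35%."
    print(extract_ef(report))
    print(ef_below_40(report))
```

A production system would need many more patterns (qualitative descriptions, negation, multiple measurements per report), which is why the rules were refined iteratively against a training set.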

Prevalence-adjusted, bias-adjusted kappa (PABAK) analysis showed agreement of 0.9932, 0.9715, and 0.9835 for the three pairs of reviewers who developed the reference standard. In testing, the system achieved a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F-measure of 0.992 in classifying an EF of less than 40%.
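For readers who want to relate these figures to one another, the following Python sketch shows the standard definitions. It assumes the F-measure is the balanced F1 score and that PABAK was computed over two categories; the abstract states neither. The confusion-matrix counts in the example are hypothetical, since the abstract does not report the underlying counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and PPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # recall
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # precision
    }


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def pabak(observed_agreement, n_categories=2):
    """Prevalence-adjusted, bias-adjusted kappa.

    For k categories: (k * Po - 1) / (k - 1); with k = 2 this is 2 * Po - 1.
    """
    return (n_categories * observed_agreement - 1) / (n_categories - 1)


if __name__ == "__main__":
    # Reported precision (100%) and recall (98.41%) from the abstract.
    print(round(f_measure(1.0, 0.9841), 3))

    # Hypothetical counts, for illustration only (not from the study).
    print(classification_metrics(tp=90, fp=0, tn=100, fn=10))
```

Under these assumptions, the reported precision and recall reproduce the reported F-measure of 0.992, and a two-category PABAK of 0.9932 corresponds to an observed agreement of about 99.66%.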

An automated information extraction system can accurately identify a performance measure criterion, a documented ejection fraction of less than 40%, in VA echocardiogram reports.

This system can be used by VA stakeholders to abstract information for clinical studies and performance measurement more efficiently and at lower cost than a completely manual method.