2015 National Conference

3105 — Development and validation of NLP software for extracting data on pulmonary function tests

Sauer BC, SLC IDEAS Center; Jones BE, SLC IDEAS Center; Leng J, SLC IDEAS Center; Lu C, SLC IDEAS Center; He T, SLC IDEAS Center; Teng C, SLC IDEAS Center; Zeng Q, SLC IDEAS Center

To develop and validate a natural language processing (NLP) tool that extracts spirometric values and responses to bronchodilator administration from clinical documents, and to determine whether NLP would improve characterization of pulmonary function tests (PFTs) in Veterans Health Administration (VHA) asthma patients.

We identified all patients at seven VA Medical Centers (VAMCs) within the intermountain region for FY2006-2012 with a diagnosis of asthma and evidence of a bronchodilator challenge, using Current Procedural Terminology (CPT) codes and computer-generated PFT with bronchodilator testing. We used 400 documents containing PFT information to develop NLP software to extract Forced Expiratory Volume in 1 second (FEV1), Forced Vital Capacity (FVC), and significant response to bronchodilator challenge (BDC) from both report tables and physician interpretations. Significant BDC was defined as > 12% improvement… We evaluated the NLP tool's accuracy against a reference standard of two clinician reviewers in a separate set of 1,001 documents. We then applied the NLP tool to the entire study population to estimate the number of additional BDCs obtainable by including NLP on clinical notes.
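The extraction and thresholding steps described above can be illustrated with a minimal sketch. This is not the software developed in the study: the regular expression, the assumed row layout (measure name followed by pre- and post-bronchodilator values), and the function names `extract_pre_post`, `percent_improvement`, and `exceeds_threshold` are all hypothetical. The definition of significant BDC is truncated in the text above, so the sketch checks only the stated "> 12% improvement" part and does not model any further criterion.

```python
import re

# Hypothetical row pattern: a measure name (FEV1 or FVC), then the first two
# decimal numbers that follow it, read as pre- and post-bronchodilator values.
ROW = re.compile(r"\b(FEV1|FVC)\b\D*?(\d+\.\d+)\D+?(\d+\.\d+)", re.IGNORECASE)


def extract_pre_post(text):
    """Return {measure: (pre, post)} for each FEV1/FVC row found in text."""
    values = {}
    for match in ROW.finditer(text):
        measure = match.group(1).upper()
        values[measure] = (float(match.group(2)), float(match.group(3)))
    return values


def percent_improvement(pre, post):
    """Percent change from the pre- to the post-bronchodilator value."""
    return (post - pre) / pre * 100.0


def exceeds_threshold(pre, post, threshold_pct=12.0):
    """True when improvement is strictly greater than threshold_pct.

    Only the 12% part of the definition is modeled (an assumption, since
    the source definition is truncated).
    """
    return percent_improvement(pre, post) > threshold_pct


if __name__ == "__main__":
    # Made-up example rows, not data from the study.
    sample = "FEV1 (L)  2.10  2.45\nFVC (L)  3.50  3.60"
    for measure, (pre, post) in extract_pre_post(sample).items():
        print(measure, pre, post, exceeds_threshold(pre, post))
```

A pattern this simple would not handle free-text physician interpretations or tables with other column orders, which the abstract says the actual tool also processed.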

In the validation set, NLP demonstrated 100% accuracy against physician review for extraction of baseline FEV1 and FVC values and post-bronchodilator FVC values, 99% accuracy for post-bronchodilator FEV1 values, and 99.1% accuracy for physician interpretation of response to a bronchodilator. For the 1,818 patients meeting inclusion criteria, application of the NLP increased the number of patients with complete bronchodilator challenge information by 25% (from 709 to 889). Nevertheless, among veterans in the cohort with evidence of receiving a BDC, the fraction with complete BDC information increased only from 32.2% to 39.6%.
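The 25% figure is consistent with a relative change in the patient counts reported above. A minimal arithmetic check, assuming that interpretation (the function name `relative_increase_pct` is illustrative only):

```python
def relative_increase_pct(before, after):
    """Relative change from before to after, as a percentage of before."""
    return (after - before) / before * 100.0


# Counts of patients with complete bronchodilator challenge information,
# without and with NLP, as reported in the abstract.
print(round(relative_increase_pct(709, 889), 1))  # 25.4
```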

We developed a natural language processing tool that extracts PFT results in asthma patients with excellent accuracy. This technology can help improve measurement of PFTs in large populations of Veterans with respiratory disease for epidemiologic research, surveillance, or decision support. Additional work is required to generalize the NLP software beyond the VAMCs studied.

Studies of pulmonary function in the VA should be carefully validated.