RRP 12-448 – HSR&D Study
Career Development Projects
Natural Language Processing to Ascertain Stress Test Data
Steven M. Bradley MD MPH
Rocky Mountain Regional VA Medical Center, Aurora, CO
July 2013 -
Stress testing is a critical component of the diagnostic evaluation and risk stratification of patients with ischemic heart disease (IHD). Presently, VA information systems used to measure and support high-quality IHD care, including the Cardiac Care Follow-up Clinical Study (CCFCS) and CART, do not reliably include the clinical indication or results of stress tests in a structured and interpretable format. Additionally, the VA Informatics and Computing Infrastructure (VINCI) lacks tools to readily obtain these data. The absence of interpretable data on the clinical indication and results of stress tests severely limits opportunities to measure and support the quality of risk factor modification and the high-quality use of procedural care for patients with IHD.
This project sought to develop and validate a natural language processing (NLP) application to abstract structured data from text reports of stress tests performed in the VA.
We began by developing the annotation schema using sample stress test text documents from Veterans who underwent coronary angiography at any cardiac catheterization laboratory in the VA between 2009 and 2011. This patient population was chosen to ensure a prevalence of stress tests positive for ischemia that was adequate for developing the NLP application. It became apparent from these sample documents that: 1) the clinical indication for a stress test is not routinely reported in stress test reports; 2) the complexity of stress test reports mandated focused development of our NLP tool for one type of stress test (e.g., nuclear stress tests rather than nuclear, echo, and ECG stress tests); and 3) the annotation schema required complex approaches (e.g., assertion values, skip span annotation, back references) that have not been commonly employed within the same NLP tool. We opted to focus on nuclear stress tests given that 80% of stress tests performed in relation to management of IHD with cardiac catheterization are performed with nuclear imaging. Despite these challenges, we successfully completed development of the annotation schema and training of NLP annotators. However, we were unable to complete development of the NLP tool during the study period.
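To make the three schema features named above concrete, the following is a minimal illustrative sketch in Python of how an annotation record might carry an assertion value, a skip (discontinuous) span, and a back reference at the same time. It is not the project's schema: the class names, the labels, the assertion categories, and the sample report sentence are all hypothetical and were invented for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Assertion(Enum):
    """Hypothetical assertion values; the project's actual categories are not stated."""
    PRESENT = "present"
    ABSENT = "absent"
    POSSIBLE = "possible"


@dataclass
class Annotation:
    label: str                                # hypothetical concept label
    spans: List[Tuple[int, int]]              # one or more (start, end) character offsets
    assertion: Assertion = Assertion.PRESENT
    refers_to: Optional["Annotation"] = None  # back reference to an earlier annotation

    def text(self, document: str) -> str:
        """Join the covered text of every span, skipping whatever lies between them."""
        return " ".join(document[start:end] for start, end in self.spans)


def span_of(document: str, phrase: str) -> Tuple[int, int]:
    """Return the character offsets of the first occurrence of phrase."""
    start = document.index(phrase)
    return (start, start + len(phrase))


# Synthetic sentence written for this sketch; it is not taken from any VA report.
report = (
    "Perfusion defect, moderate in size, in the inferior wall. "
    "No reversible ischemia in this region."
)

# Skip span: two pieces of text with an unannotated gap between them.
defect = Annotation(
    label="perfusion_defect",
    spans=[span_of(report, "Perfusion defect"), span_of(report, "in the inferior wall")],
)

# Assertion value plus back reference: "this region" points back to the defect.
ischemia = Annotation(
    label="ischemia",
    spans=[span_of(report, "reversible ischemia")],
    assertion=Assertion.ABSENT,
    refers_to=defect,
)

print(defect.text(report))
print(ischemia.label, ischemia.assertion.value, "->", ischemia.refers_to.label)
```

Storing spans as a list rather than a single pair is what lets one annotation skip over intervening text, and keeping the back reference as a link to another annotation lets a negated finding inherit the location it refers to.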
We gained an appreciation for the complexity and challenges of NLP tool development for semi-structured text documents such as nuclear stress tests. We will employ this understanding in future studies to develop NLP tools for ascertainment of stress test results in structured data formats.
Major goals of the IHD-QuERI include 1) leveraging data stored in new and existing information systems to improve the quality and safety of care for IHD patients and 2) improving cardiovascular risk factor management by integrating new programs into evolving systems of care. The NLP stress test project supports these goals by developing the skill sets needed to refine existing information systems and establish new ones for measuring and supporting risk factor management and the optimal use of procedural care in Veterans with ischemic heart disease.
None at this time.