Stroke is a leading cause of death and disability in the US. The VA has recently conducted national efforts to reorganize and improve stroke care, including measurement of inpatient stroke quality indicators (QIs). To improve our ability to measure these important QIs efficiently and consistently, and to respond to national goals, the VA is increasing its use of electronic measurement of QIs. The Stroke QUERI is partnering with the Office of Informatics and Analytics to conduct some of this operational work. This project aims to enhance the accuracy of four specific stroke QIs that do not have adequate data elements represented in structured VA VistA data systems.
The Specific Aims of this project are to: 1) Use text analysis strategies to enhance CDW queries and generate the four electronic QIs, 2) Compare the electronic QIs to already completed chart reviews in 11 VAMCs, calculating sensitivity and specificity for each numerator and denominator, and 3) Generate facility reports for the stroke electronic QIs.
We used an existing dataset of 2,130 ischemic stroke admissions at 11 VAMCs with completed chart review for all of the MU and VA stroke indicators (prior HSR&D QUERI SDP 09-158). We extracted CDW data for these admissions, including inpatient and outpatient diagnoses, admission TIU notes, medications, and laboratory data. For the NIHSS indicator, we explored SQL text string searching to identify documentation of NIHSS completion within 24 hours of admission and compared the results to chart review. For the education indicator, we examined existing health factors for stroke education at discharge and compared these to chart review. For the tPA indicator, we developed an annotation scheme and trained annotators to identify mentions of stroke symptoms, symptom time, time of presentation, and thrombolysis decision in all notes signed within the first 18 hours of arrival, plus the first Neurology note (if present). Stroke admissions (with their qualifying TIU notes) were stratified by the chart review-based determination of presentation within two hours of symptom onset and then randomly assigned to either a training or a test set. Inter-annotator reliability was assessed throughout the project and maintained at >85%. The NLP tool processes each note and identifies annotated stroke symptoms and their relationships to annotated times. Each occurrence of a stroke symptom serves as an observation, with its corresponding assertion used to classify the occurrence and its relationship to time. We will then compare time-of-onset information across notes from the same stroke admission to develop rules for classifying time of onset at the admission level. NLP output for each admission will then be incorporated into the existing SQL algorithms for the tPA denominator statement (eligibility).
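The text string search for NIHSS documentation within 24 hours of admission can be sketched as follows. This is a minimal illustration, not the project's actual query: the search patterns, note structure, and function name are assumptions, and the real work was done in SQL against CDW TIU note text.

```python
import re
from datetime import datetime, timedelta

# Illustrative patterns only; the project's actual search terms are not shown here.
NIHSS_PATTERN = re.compile(r"\bNIHSS\b|\bNIH stroke scale\b", re.IGNORECASE)

def nihss_documented_within_24h(admission_time, notes):
    """Return True if any note signed within 24 hours of admission
    mentions the NIHSS. `notes` is a list of (signed_time, text) pairs."""
    cutoff = admission_time + timedelta(hours=24)
    return any(
        admission_time <= signed <= cutoff and NIHSS_PATTERN.search(text)
        for signed, text in notes
    )

admit = datetime(2013, 5, 1, 8, 0)
notes = [(datetime(2013, 5, 1, 10, 30), "NIH Stroke Scale score: 4")]
print(nihss_documented_within_24h(admit, notes))  # → True
```

A production query would also need to handle negated or templated mentions, which is one reason validation against chart review is essential.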
For all the eCQMs, we will compare the indicator denominators and numerators to the existing chart review data, calculating the proportion of matched cases and the sensitivity and specificity of the numerator and denominator for each indicator. We will then generate CDW reports for the indicators, demonstrating feasibility within the current CDW data structure.
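Validation against chart review reduces to a standard confusion-matrix calculation, with chart review treated as the reference standard. A minimal sketch (function name and toy data are illustrative, not project data):

```python
def sensitivity_specificity(electronic, chart):
    """Sensitivity and specificity of the electronic indicator versus
    chart review. Both inputs are lists of booleans, one per admission."""
    pairs = list(zip(electronic, chart))
    tp = sum(e and c for e, c in pairs)          # both positive
    tn = sum(not e and not c for e, c in pairs)  # both negative
    fn = sum(not e and c for e, c in pairs)      # missed by electronic query
    fp = sum(e and not c for e, c in pairs)      # electronic false alarm
    return tp / (tp + fn), tn / (tn + fp)

sens, spec = sensitivity_specificity(
    [True, True, False, False, True],
    [True, True, True, False, False],
)
# sensitivity 2/3, specificity 1/2 on this toy example
```

The same calculation applies separately to each indicator's denominator (eligibility) and numerator (passing).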
The NIHSS indicator has been completed, with 99.7% matching (sensitivity) in the denominator and 98.7% in the numerator. Specificity was likewise very high: 100% in the denominator and 97.1% in the numerator. Agreement for a three-level outcome at the admission level (ineligible, eligible-passed, eligible-failed) was very high, with K = 0.95 (95% CI 0.93-0.96).
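The chance-corrected agreement statistic reported here is Cohen's kappa over the three admission-level categories. A stdlib sketch of the unweighted kappa calculation, with invented example labels (not project data):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa between two sets of categorical labels."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Expected chance agreement from each rater's marginal frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n**2
    return (observed - expected) / (1 - expected)

electronic = ["ineligible", "eligible-passed", "eligible-failed", "ineligible"]
chart      = ["ineligible", "eligible-passed", "eligible-passed", "ineligible"]
print(cohens_kappa(electronic, chart))  # → 0.6
```

In practice a statistical package would also supply the confidence interval reported above.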
For the education indicator, denominator matching (sensitivity) was high at 98.6%, but numerator matching was low at only 41.1%. Numerator specificity was 98%, but overall agreement was only K = 0.18 (p < 0.0001). Most numerator errors were false negatives due to the inability to correctly identify all five types of required stroke education at discharge.
For the tPA indicator, a training set of 624 admissions (2,700 notes) and a test set of 624 admissions (2,741 notes) were annotated. Agreement for the training set ranged from 0.85 to 0.96, with an average of 0.91 throughout the annotation process. NLP tool development in the training set is ongoing. We have built a vocabulary of terms for each annotation type; these terms are being used as features for machine learning. Multiple rounds of machine learning will be performed and evaluated to optimize parameter values and identify the most accurate classification model for the data set. Matching (sensitivity) for receipt of tPA is 56% (14/25 received tPA per chart review), with the majority of errors resulting from lack of ED medication documentation in VA bar code administration systems. There were 18 cases (0.7% of admissions) in which tPA receipt was identified as a false positive; these were cases in which tPA was used for catheter flush or another non-stroke indication.
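The vocabulary-as-features approach described above can be sketched as a simple bag-of-terms encoding of each note. The term lists and note text below are invented for illustration; the project's actual vocabulary is derived from the annotated training set, and its learner and parameters are not specified here.

```python
# Hypothetical vocabulary keyed by annotation type; real term lists
# come from the annotated training notes.
VOCABULARY = {
    "symptom": ["facial droop", "slurred speech", "weakness", "aphasia"],
    "time": ["last known well", "onset", "woke up with"],
}

def note_features(text):
    """Encode a note as binary term-presence features for a classifier."""
    lowered = text.lower()
    return {
        f"{ann_type}:{term}": int(term in lowered)
        for ann_type, terms in VOCABULARY.items()
        for term in terms
    }

feats = note_features("Pt with slurred speech, last known well 0730.")
print([k for k, v in feats.items() if v])
# → ['symptom:slurred speech', 'time:last known well']
```

Feature vectors like these, one per symptom occurrence, would then feed the rounds of model training and parameter tuning described above.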
This project has generated a valid electronic measure of the NIHSS, one of the three VA IPEC stroke indicators. The project has also demonstrated that the education indicator is not likely to be sufficiently accurate using existing administrative data. If the NLP tool is sufficiently accurate, it will result in the development and validation of an electronic measure of the tPA indicator or, at the least, an ability to identify the small subset of stroke admissions likely to present early enough to be identified for further chart review.
This project has already provided important information to partners at OABI about the use of unstructured data for inpatient eCQMs, informing decisions about the methods of constructing these QIs.
None at this time.
Technology Development and Assessment