The VA redesigned its health care system, the Veterans Health Administration (VHA), through better use of information technology, measurement and reporting of performance, reorganization of health care delivery, and realignment of payment policies.2-3 However, the collection of data for quality measures through retrospective chart review is time consuming and expensive.1 Automating the abstraction process with software would benefit the VA because data could be obtained more quickly and at a lower cost. Further, using such software for this purpose would be a step toward wider use of automated text extraction in VA text documents. Informatics tools have been used to extract data from several document types in non-VA settings, such as imaging reports,4-5 discharge summaries,6-7 and problem lists,8 among others. Within the VA, informatics tools have been used to extract data with high accuracy from documents such as physical examinations for disability claims.9 Another study used text processing software to detect central venous catheter adverse events using extracts of VA records.1 However, text extraction or natural language processing software has not been studied using VA discharge instructions to evaluate whether performance measurement criteria are met. Because it is important to use an iterative process to refine the accuracy of the Multithreaded Clinical Vocabulary Server (MCVS) (i.e., to train the software) within the clinical context in which it will be used,10 it is important to pilot the use of the MCVS to extract data from VA discharge instructions.
Hypothesis: The use of an automated text-abstraction process to evaluate the completion of required elements within the document used at the PVAMC to record discharge instructions for inpatients with congestive heart failure will be as accurate as human abstractors.
Aim #1: To evaluate methods to compare the accuracy of the MCVS's and human abstractors' ability to identify the presence of complete discharge instructions for inpatients at the PVAMC with congestive heart failure (CHF), based on the External Peer Review Program (EPRP) required data elements.
Subaim #1: To evaluate accuracy assessment methods related to the MCVS's ability to extract data, using a comparison with manual data abstraction based on clinician review of PVAMC discharge instruction documents.
Subaim #2: To evaluate accuracy assessment methods related to the MCVS's ability to extract data, using a comparison with External Peer Review Program (EPRP) data for CHF inpatient cases in 2003.
Aim #2: To determine recommendations related to the use of the MCVS in PVAMC discharge instructions so that the use of the tool can be improved.
Subaim #1: To determine what contributes to accuracy of the MCVS
Subaim #2: To determine the barriers to accuracy of the MCVS
Subaim #3: To determine methods by which the MCVS can demonstrate improved performance.
C. Study Design and Approach:
This is a descriptive study that will quantify the test characteristics of the Multithreaded Clinical Vocabulary Server's (MCVS) ability to identify the presence of complete discharge instructions in a training dataset of 160 inpatients discharged from the PVAMC in 2003 with congestive heart failure. The criteria used to determine discharge instruction completion will be the same as the EPRP required elements. This study will provide a preliminary comparison of an existing manual method of data abstraction with a new method of automated data abstraction using the MCVS. It will also pilot the MCVS so that it can be trained in the clinical context in which it will be used for further studies.
Study Population and Sample: EPRP Data: There were 160 inpatients with CHF discharged in 2003. These cases will be used as a training set to improve the MCVS's ability to detect what constitutes a completed discharge instruction document. The criteria used by the EPRP to determine the presence of a completed discharge instruction are listed in a subsection below. In a subsequent study we will evaluate the discharge instructions of 100% of the inpatients discharged from the PVAMC with CHF between 2004 and 2007. The total number of cases for which EPRP data have been abstracted for 2004-2007 is 635. Statistical Power for Subsequent Research to Test the MCVS: The proposed research will prepare the MCVS for the next study, the test phase. During the test phase the MCVS will need to be used with 625 discharge instructions for PVAMC inpatients discharged with CHF so that we can detect, with 95% confidence and 80% power using a 2-sided alpha, a difference of 2% from the mean rate of documentation. We will have adequate power using the 635 cases from the EPRP analysis for the subsequent test phase. If we consider all CHF discharges from 2003-2007 there are 815 cases. Because there are 160 inpatients with CHF discharged in 2003, the training set will represent 20% of the 815 cases from EPRP data for CHF inpatients from 2003-2007. After training the MCVS, the remaining 655 cases (80%) will be used as a test set in a subsequent study that will evaluate the use of the MCVS in a larger set of discharge instruction documents. This research methodology has been established in prior studies by Brown et al.1
Data Collection: Preparation of Documents: The discharge instructions for CHF inpatients will be identified for 2003. A research assistant will copy the discharge instructions for each case and paste them into a Word document. Each document will be de-identified so that all patient and provider identifiers are removed. The document will be saved with a pseudo-identifier on the secure VA CHERP server. Prior to using the MCVS, the documents will be uploaded directly from the secure VA CHERP server to the secure VA TVHS server, both of which are behind the VA firewall. The documents will then be available for use with the MCVS. The MCVS will be used via a VPN from Mayo Clinic to the TVHS server to extract the required data elements. All records and data will remain behind the firewall on the TVHS server. This process was used in a prior VA study by Brown et al.1 Congestive Heart Failure Performance Measurement Criteria for Complete Discharge Instructions: According to the EPRP technical manual, each discharge instruction should contain the following 6 elements in the written instructions: (1) activity level; (2) diet; (3) discharge medications; (4) follow-up appointment with MD/NP/PA; (5) weight monitoring after discharge, including documentation that patients should weigh themselves daily, keep a record of their weight, be instructed about what weight change indicates a significant weight gain, and know when to contact their health care provider if significant weight change occurs (for example, call provider if you gain >2-3 lbs overnight or call provider if you gain >3-5 lbs in the course of a week); and (6) what to do if symptoms worsen.11
Under the EPRP data abstraction, the measure is designated as complete only if all elements of the discharge instructions are present. These data are obtained by one abstractor with assistance from medical center personnel. For the purposes of our study we will establish a gold standard for the presence of completed discharge instructions. Two trained chart abstractors will review the discharge instructions for the presence of the required elements. If the two reviewers agree that all required instructions are present, the discharge instructions will be recorded as complete. If the two reviewers disagree, a third person will adjudicate to complete the gold standard. For example, if the third reviewer determines that the discharge instructions are incomplete, the final determination recorded in the database will be that the discharge instructions are incomplete. In contrast to the EPRP reviewers, who can access the entire medical record, the reviewers in our study will have access only to the discharge instructions. We will limit the data available to our human abstractors because the discharge instructions will be the only document available to the MCVS for analysis.
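The all-or-none completeness rule and the two-reviewer adjudication described above can be sketched in a few lines. This is only an illustration under stated assumptions: the element keys and the function names `is_complete` and `gold_standard` are hypothetical and are not part of the EPRP technical manual or the study database.

```python
from typing import Mapping, Optional

# The six EPRP elements named in the text. The key names are illustrative
# assumptions, not identifiers used by EPRP or the study.
REQUIRED_ELEMENTS = (
    "activity_level",
    "diet",
    "discharge_medications",
    "follow_up_appointment",
    "weight_monitoring",
    "worsening_symptoms",
)


def is_complete(elements_found: Mapping[str, bool]) -> bool:
    """All-or-none rule: complete only if every required element is present."""
    return all(elements_found.get(name, False) for name in REQUIRED_ELEMENTS)


def gold_standard(reviewer_a: bool, reviewer_b: bool,
                  adjudicator: Optional[bool] = None) -> bool:
    """Return the reference determination (complete or not) for one document.

    If the two reviewers agree, their shared answer stands. If they
    disagree, the third reviewer's determination is recorded.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if adjudicator is None:
        raise ValueError("reviewers disagree; an adjudicator decision is required")
    return adjudicator
```

A missing key is treated as an absent element, so a document with no weight-monitoring entry is scored incomplete rather than raising an error.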
Following the determination of the gold standard based on manual chart abstraction, the results of the human abstractors will be compared to the EPRP data for the same records. Tests of significance will be used to determine whether there is a statistically significant difference between the gold standard and the EPRP data abstraction. In addition, when the two sources differ on whether instructions are complete or incomplete, the medical record and the EPRP data sheet will be reviewed to determine the cause of the difference. This will help characterize the limitations of the performance of the MCVS. For example, EPRP reviewers may use information from other parts of the medical record in addition to the discharge instructions, and these data will not be available to the MCVS.
Process of Training the MCVS: We will reformulate all 6 human-readable discharge instruction criteria into a format suitable for computer implementation based on the methodology developed by Brown et al.9 First, we will manually identify the concepts contained in each criterion and map them into SNOMED CT using a terminology browser. We will map concepts to either single SNOMED CT concepts or to explosions of SNOMED CT concepts. An exploded concept is linked to more specific subconcepts within the terminology. For example, the explosion of myocardial infarction includes anterior myocardial infarction, inferior myocardial infarction, lateral myocardial infarction, and several other subtypes of myocardial infarction from the SNOMED CT hierarchies. We will represent concepts that cannot be mapped to SNOMED CT using simple strings. Second, we will build complex rules by combining mapped concepts and unmapped strings with the Boolean operators and, or, and not. Finally, we will specify the section of the discharge instruction (e.g., activity level, diet, etc.) to which each rule is to be applied. This approach will allow the specification of computer-usable rules that can be applied to each discharge instruction document via 3 steps. A standard process has been established to prepare the Multithreaded Clinical Vocabulary Server (MCVS) for use in a new dataset.9 In the first step, the MCVS separates discharge instructions into report sections (e.g., activity level, diet, etc.). In the second step, the MCVS indexes the document. Before indexing, words are normalized using a variation of the National Library of Medicine's public domain software program NORM from the Unified Medical Language System's knowledge source server, and then sentences are broken into single-word and multiword phrases.
The MCVS indexes the phrases from each document using SNOMED CT; in so doing, it separately identifies phrases that indicate positive, negative, and uncertain assertions and constructs compositional expressions from combinations of simpler concepts (e.g., left foot is represented as the body part foot with laterality left). The MCVS has been extensively tested in Mayo's usability laboratory and has been described in the medical literature. In the third step, a rules evaluation engine sequentially applies the rules expressed in health assessment language against the indices created for each document.
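The rule structure described above (mapped concepts, exploded concepts, and plain strings combined with and, or, and not, then applied to one section of the document) can be illustrated with a small sketch. It is not the MCVS rules engine or its health assessment language: the toy `HIERARCHY`, the concept labels, the `Rule` class, and the assumption that the index holds only positively asserted terms per section are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

Found = Set[str]                      # terms positively asserted in one section
Predicate = Callable[[Found], bool]   # a rule fragment evaluated on those terms

# Toy stand-in for a terminology hierarchy: concept -> more specific subconcepts.
HIERARCHY: Dict[str, Set[str]] = {
    "myocardial infarction": {
        "anterior myocardial infarction",
        "inferior myocardial infarction",
        "lateral myocardial infarction",
    },
}


def has(term: str) -> Predicate:
    """Match a single mapped concept or an unmapped string."""
    return lambda found: term in found


def exploded(concept: str) -> Predicate:
    """Match the concept itself or any of its listed subconcepts."""
    terms = {concept} | HIERARCHY.get(concept, set())
    return lambda found: bool(terms & found)


def all_of(*preds: Predicate) -> Predicate:   # Boolean "and"
    return lambda found: all(p(found) for p in preds)


def any_of(*preds: Predicate) -> Predicate:   # Boolean "or"
    return lambda found: any(p(found) for p in preds)


def negate(pred: Predicate) -> Predicate:     # Boolean "not"
    return lambda found: not pred(found)


@dataclass(frozen=True)
class Rule:
    name: str
    section: str          # section of the document the rule applies to
    predicate: Predicate

    def evaluate(self, index: Dict[str, Found]) -> bool:
        # A section missing from the index is treated as containing no terms.
        return self.predicate(index.get(self.section, set()))


# Hypothetical rule: the brochure string alone satisfies the criterion,
# otherwise both weight-monitoring phrases must be present.
weight_rule = Rule(
    name="weight monitoring",
    section="weight monitoring",
    predicate=any_of(
        has("CHF Self-Management Plan"),
        all_of(has("weigh daily"), has("record weight")),
    ),
)
```

Keeping each rule tied to a named section mirrors the text's point that a rule is evaluated only against the part of the document where the instruction is expected to appear.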
Multiple cycles of rule improvement will be conducted using the training set. In the first step of each cycle, the computer-usable rules will be applied to each discharge instruction included in the training set using concept-based indexing software. The results of the algorithmic approach will be compared to the gold standard of human expert review. The true-positive rate, false-positive rate, true-negative rate, false-negative rate, sensitivity, and specificity will be calculated for each quality rule. The second step of the rule improvement cycle will be a manual failure analysis of the false-positive and false-negative results. Rule modification based on the failure analysis will be the final step of the cycle. Mapped concepts or strings will be either added to or deleted from existing rules in an attempt to improve performance. When improvement cycles reach a preset level of accuracy (90% sensitivity and 75% specificity) using the training set, we will evaluate the resulting final rule set on the test set of discharge instructions.
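The per-rule accuracy check at the end of each improvement cycle can be sketched as follows. The function names are hypothetical; the 90% sensitivity and 75% specificity thresholds are the preset levels stated above, and "positive" here means the rule or reviewer judged the criterion to be present.

```python
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[bool],
                     gold: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for rule output versus the gold standard."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    tn = sum(not p and not g for p, g in zip(predicted, gold))
    fn = sum(not p and g for p, g in zip(predicted, gold))
    return tp, fp, tn, fn


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Sensitivity = tp / (tp + fn); specificity = tn / (tn + fp).

    A measure is NaN when its denominator is zero (no gold positives
    or no gold negatives in the set).
    """
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def meets_target(sens: float, spec: float,
                 min_sens: float = 0.90, min_spec: float = 0.75) -> bool:
    """True when both preset thresholds are reached; NaN never passes."""
    return sens >= min_sens and spec >= min_spec
```

The false positives and false negatives counted here are the cases that would go to the manual failure analysis in the second step of the cycle.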
Data Analysis Plan: During development, the evolving rules will be applied iteratively to the training set (n=160) of discharge instructions. The results of the algorithmic review will be compared with the gold standard results generated by human expert review. Sensitivity, specificity, and percentage of agreement with the consensus gold standard will be generated for each rule. The results of the algorithmic review will also be compared to the selected EPRP data. In all instances where the determination of discharge instruction completion differs between manual review and the EPRP data, the reasons for the discrepancies will be examined. We will calculate inter-rater reliability and test for significant differences using the kappa statistic. These statistics will be calculated for the following pairs: the text extractor and the EPRP data, the gold standard based on manual review and the text extractor, and the gold standard based on manual review and the EPRP data.
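For reference, Cohen's kappa for two sets of determinations on the same documents is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's label frequencies. A minimal sketch follows; the function name is hypothetical, and the text does not state which software the study used for this calculation.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)
```

The same function applies to each of the three pairings listed above, since each pairing is simply two sequences of complete or incomplete determinations over the same records.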
The findings from the research at this time include an inter-rater reliability analysis for the development of the gold/reference standard, descriptive statistics on the presence of quality criteria in the documents analyzed, a preliminary summary of causes for the absence of criteria, and measures of accuracy of the MCVS in comparison to the gold/reference standard.
An inter-rater reliability analysis was undertaken based on the findings of the independent review by the nurse practitioners. The Kappa for the summary criterion was 0.88 (95% confidence interval, 0.78 to 0.93). Based on the independent review, there were 8 patient records with disagreement about overall completion of discharge instructions. The Kappa statistics for the individual discharge instruction criteria ranged from 0.38 to 1.0.
Presence of CHF Discharge Instruction Quality Criteria Based on Gold/Reference Standard:
Based on the gold/reference standard, criterion 1 was not present 3% of the time, criterion 2 was not present 0.06% of the time, criterion 3 was not present 1.25% of the time, criterion 4 was not present 6.9% of the time, criterion 5 was not present 36% of the time, criterion 6 was not present 34% of the time, and the overall percentage of incomplete discharge instructions (based on the completion of criteria 1-6) was 41%. Note that there may be evidence of meeting the criteria in sections of the record other than the discharge instructions (such as the nursing notes), but the reviewers had access only to the discharge instructions. Each record with missing criteria is being reviewed and evaluated as part of our failure analysis process to accomplish Aim 2.
Preliminary Findings Related to Why Criteria Are Not Present in the Documents:
At the medical center where this study was done, the nursing staff developed a comprehensive brochure to educate patients with CHF. If this brochure is provided to patients and its distribution is documented in the medical record, EPRP abstractors consider the discharge instruction criteria met for criteria 3-6. The nursing staff frequently provides this brochure to the patient during hospitalization or at the time of discharge. In these circumstances, the documentation of the use of this brochure is located in the nursing notes or the nursing discharge summary. We found documentation of the use of the brochure in the nursing notes in 9 cases. We also found two cases in which two sets of discharge instructions had been created, one of them updated to reflect the required discharge instruction quality criteria, and the research study staff had selected the incomplete instructions rather than the complete ones for inclusion in the study documents. Finally, there were 10 patients who did not have the discharge diagnosis of CHF listed on the discharge instructions, and we hypothesize that this is the reason they were not given instructions.
Comparison of MCVS and Gold/Reference Standard:
The MCVS was used to determine the presence or absence of the quality criteria in the documents. The percent agreement, sensitivity, specificity, and positive predictive value were calculated based on a comparison of the gold/reference standard and the MCVS findings. One hundred fifty-one of the 160 documents were processed by the MCVS. The MCVS achieved an overall percent agreement with the gold/reference standard for the bundle criteria of 89.4%, a sensitivity of 87%, a specificity of 86%, and a positive predictive value of 89%. Forty-three rules were developed using SNOMED concepts and strings within parsed sections of the documents. All rules were provided by the MCVS team to the principal investigator of this research. Five iterations of rule development resulted in the final rule set. No new terms were added to the MCVS vocabulary, but the two strings "CHF Self-Management Plan" and "CHF Education Series" were added. Nine documents could not be processed by the MCVS. These documents could not be indexed because XML special characters inside the documents, such as <, &, and >, interrupted processing after the documents were parsed.
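The failure mode described for the nine unprocessed documents, raw <, &, and > characters breaking XML processing, can be reproduced and avoided with standard-library escaping. This sketch is an illustration only and is not the remedy used by the MCVS team; the `wrap_as_xml` helper, the `section` element, and the sample sentence are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def wrap_as_xml(section: str, text: str) -> str:
    """Wrap free text in a <section> element, escaping XML special characters.

    escape() replaces &, <, and > with entities; quoteattr() safely
    quotes the attribute value.
    """
    return f"<section name={quoteattr(section)}>{escape(text)}</section>"


def parses(xml_string: str) -> bool:
    """Return True if the string is well-formed XML."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


# Made-up sample text containing all three special characters.
raw = "Call provider if weight gain is >2-3 lbs overnight & <see brochure>"

unescaped = f"<section>{raw}</section>"      # not well-formed XML
escaped = wrap_as_xml("weight monitoring", raw)
```

Parsing the escaped form returns the original text unchanged, so clinical thresholds written with > or < survive intact for indexing.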
This study pilots the use of an informatics tool to automatically extract performance data for congestive heart failure, a high-incidence, high-impact disease. It addresses critical gaps in the effective use of electronic information to support clinical care and health services research, as well as in the efficiency of measuring performance and quality.