The Veterans Health Information Systems and Technology Architecture (VistA) is an integrated system of software applications that directly supports patient care at Veterans Health Administration (VHA) healthcare facilities. To facilitate Veteran care, VistA maintains a massive repository of patient-related data, including over 1.3 billion textual documents (e.g., progress notes, discharge summaries). For Veterans with complex and chronic diseases, thousands or tens of thousands of text-based progress notes may be associated with their electronic health record (EHR). Being able to search these medical records effectively can help clinicians locate needed information more easily and quickly.
Traditional information retrieval (IR) systems, which are used to identify useful information in large text-based data repositories, select documents based on characteristics of the document collection being searched. IR systems can also make use of outside domain knowledge to augment the search process. For instance, synonymous terms can be located using a vocabulary and included automatically in a query. In the medical community, resources such as the Unified Medical Language System (UMLS) Metathesaurus and MEDLINE/PubMed records represent rich sources of domain knowledge. However, most IR systems use these resources in a localized way (such as the query expansion just described). This pilot study introduces a novel method of using domain-wide knowledge to facilitate IR. In particular, the relative importance of medical concepts is compared across concepts, quantified, transformed into domain weights, and used in an IR system. These weights give the IR system a further way to discriminate among documents for retrieval purposes within a clinically specified topical area.
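The localized use of domain knowledge described above, vocabulary-based query expansion, can be sketched as follows. This is a minimal illustration only: the `SYNONYMS` table and the `expand_query` function are hypothetical stand-ins, not a UMLS API and not part of the study's systems.

```python
# Hypothetical synonym table; a real system would draw these from a
# controlled vocabulary such as the UMLS Metathesaurus.
SYNONYMS = {
    "mrsa": ["methicillin-resistant staphylococcus aureus"],
    "abx": ["antibiotic"],
}


def expand_query(query, synonyms):
    """Return the query terms followed by any synonyms found for them."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in synonyms.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


print(expand_query("MRSA treatment", SYNONYMS))
```

Expansion of this kind touches only the terms that appear in the query, which is the sense in which it is localized.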
The objectives of this study were to (1) adapt existing IR systems to incorporate domain knowledge and (2) evaluate the IR systems developed under the first objective using a corpus of clinical documents from Veterans, the majority of whom had a positive molecular screening and culture for methicillin-resistant Staphylococcus aureus (MRSA).
Two experimental IR systems were compared against two baseline IR systems to explore the impact of including domain weights: (1) baseline vector space model (VSM) (Base_VSM), (2) domain-weighted VSM (Domain_VSM), (3) baseline latent semantic indexing (LSI) (Base_LSI), and (4) domain-weighted LSI (Domain_LSI). Domain weights were determined by building a graph representing domain knowledge (such as that found in vocabularies and ontologies), calculating the relative importance of concepts related to MRSA, and integrating those importance values into the IR systems. The graph was built using two sources of information: (1) hierarchical information on SNOMED-CT concepts from the UMLS Metathesaurus and (2) co-occurrence data for SNOMED-CT concepts from all 2011 English-language journal articles with abstracts in MEDLINE. Personalized PageRank was calculated on all vertices of the graph to determine the relative importance of each concept to MRSA. The PageRank values were integrated into the IR systems by transforming the values linearly based on rank. Nine variations of the transformation were tested to determine the impact of putting more or less emphasis on the rank of the concepts.
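To make the weighting step concrete, the sketch below runs Personalized PageRank by power iteration on a tiny toy graph and then maps the resulting ranks linearly onto weights. Everything in it is an assumption made for illustration: the five-node graph, the damping factor of 0.85, the handling of dangling nodes, and the weight range of 1.0 to `max_weight` are not taken from the study, which does not state its exact transformation formula.

```python
def personalized_pagerank(graph, seed, damping=0.85, iterations=100):
    """Power-iteration Personalized PageRank.

    graph maps each node to a list of its out-neighbors. All teleport
    probability returns to the single seed node.
    """
    nodes = list(graph)
    rank = {node: (1.0 if node == seed else 0.0) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: 0.0 for node in nodes}
        for node in nodes:
            neighbors = graph[node]
            if neighbors:
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # Assumption: a dangling node sends its mass back to the seed.
                new_rank[seed] += damping * rank[node]
        # Ranks sum to 1, so the teleported mass is exactly (1 - damping).
        new_rank[seed] += 1.0 - damping
        rank = new_rank
    return rank


def rank_to_weights(scores, max_weight=2.0):
    """Map concepts to weights from max_weight down to 1.0, linear in rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    if len(ordered) == 1:
        return {ordered[0]: max_weight}
    step = (max_weight - 1.0) / (len(ordered) - 1)
    return {concept: max_weight - i * step for i, concept in enumerate(ordered)}


# Toy undirected graph written as symmetric adjacency lists (hypothetical).
toy_graph = {
    "mrsa": ["staph_infection", "vancomycin"],
    "staph_infection": ["mrsa", "vancomycin", "bone_disorder"],
    "vancomycin": ["mrsa", "staph_infection"],
    "bone_disorder": ["staph_infection", "fracture"],
    "fracture": ["bone_disorder"],
}

scores = personalized_pagerank(toy_graph, seed="mrsa")
weights = rank_to_weights(scores, max_weight=2.0)
for concept in sorted(weights, key=weights.get, reverse=True):
    print(concept, round(scores[concept], 3), weights[concept])
```

In this sketch, varying `max_weight` is one hypothetical way to put more or less emphasis on rank; the nine variations tested in the study may have been defined differently.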
The IR systems were tested against 14,734 medical notes generated from November 2007 to March 2009 from a sample of 25 Veterans. Twenty Veterans had a positive MRSA molecular screening and culture. The remaining five Veterans had a negative MRSA molecular screening and two or more negative MRSA cultures. Eleven queries, representing clinical information needs of clinicians with MRSA-positive patients, were developed and evaluated for relevance against the document collection by two clinical co-investigators. The IR systems were compared using four measures: inferred average precision (infAP), binary preference (bpref), break-even point of precision and recall using only judged documents (R-precision(j)), and precision at 10 judged documents (P@10(j)).
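Three of these measures can be illustrated with a short sketch. It assumes binary relevance judgments, follows the commonly cited formulation of bpref, and computes the judged-only variants by first removing unjudged documents from the ranking. The function names and toy data are hypothetical, and the study's own evaluation tooling may differ in detail; infAP is omitted because it depends on the sampling scheme used to select documents for judging.

```python
def judged_only(ranking, judgments):
    """Drop documents that have no relevance judgment."""
    return [doc for doc in ranking if doc in judgments]


def precision_at_k_judged(ranking, judgments, k=10):
    """Precision over the first k judged documents (divides by k)."""
    top = judged_only(ranking, judgments)[:k]
    return sum(1 for doc in top if judgments[doc]) / k


def r_precision_judged(ranking, judgments):
    """Precision at R judged documents, where R is the number of relevant docs."""
    num_relevant = sum(1 for rel in judgments.values() if rel)
    if num_relevant == 0:
        return 0.0
    return precision_at_k_judged(ranking, judgments, k=num_relevant)


def bpref(ranking, judgments):
    """Binary preference: penalize relevant docs ranked below judged nonrelevant ones."""
    num_relevant = sum(1 for rel in judgments.values() if rel)
    num_nonrelevant = len(judgments) - num_relevant
    if num_relevant == 0:
        return 0.0
    nonrelevant_seen = 0
    total = 0.0
    for doc in judged_only(ranking, judgments):
        if judgments[doc]:
            if num_nonrelevant == 0:
                total += 1.0
            else:
                penalty = min(nonrelevant_seen, num_relevant)
                total += 1.0 - penalty / min(num_relevant, num_nonrelevant)
        else:
            nonrelevant_seen += 1
    return total / num_relevant


# Toy data: "u1" is retrieved but was never judged.
judgments = {"d1": True, "d2": False, "d3": True, "d4": False, "d5": True}
ranking = ["d1", "u1", "d2", "d3", "d4", "d5"]

print(bpref(ranking, judgments))                       # 0.5
print(r_precision_judged(ranking, judgments))          # 2 of the first 3 judged
print(precision_at_k_judged(ranking, judgments, k=3))
```

Ignoring unjudged documents keeps the measures from treating notes that no assessor reviewed as nonrelevant, which matters when only a small fraction of a large collection has been judged.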
Two clinical co-investigators judged 637 documents across 11 queries, of which 510 documents were assessed by both assessors. Cohen's Kappa was 0.68 (n=510). One query did not find any relevant documents and was dropped from subsequent assessment and evaluation. Among the various transformations applied to PageRank values, performance increased by 1.94% to 4.39% (Domain_VSM) and 7.61% to 12.07% (Domain_LSI) between the lowest- and highest-performing weighting options, averaged across the 10 queries, for the four measures. Based on average performance across all 10 queries (using the highest-performing weighting variations), the IR systems performed in the following order (best to worst) across all four evaluation measures: Domain_VSM, Base_VSM, Domain_LSI, and Base_LSI. The average performance of the IR systems (listed in that same order) was 0.72, 0.70, 0.65, and 0.56 for infAP; 0.71, 0.70, 0.60, and 0.50 for bpref; 0.70, 0.68, 0.58, and 0.49 for R-precision(j); and 0.81, 0.73, 0.65, and 0.49 for P@10(j).
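For readers unfamiliar with the agreement statistic reported above, Cohen's Kappa compares the observed agreement between two assessors with the agreement expected by chance. The sketch below uses ten invented judgments for illustration; it does not reproduce the study's data or its value of 0.68.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two assessors labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both assessors must label the same non-empty item set")
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    categories = set(labels_a) | set(labels_b)
    # Chance agreement: product of each assessor's marginal label frequencies.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # Both assessors used a single identical label throughout.
    return (observed - expected) / (1.0 - expected)


# Invented relevance judgments (1 = relevant, 0 = not relevant).
assessor_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
assessor_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

print(round(cohens_kappa(assessor_1, assessor_2), 2))  # 0.6
```

Here the assessors agree on 8 of 10 items, chance agreement is 0.5, and kappa is therefore (0.8 - 0.5) / (1 - 0.5) = 0.6.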
The VistA EHR system represents the cornerstone of clinical care in the VA. The goal of this stream of research is to make finding relevant information within a Veteran's electronic health record easier for clinicians, thus improving the process of care and, potentially, patient outcomes. The results of this pilot study demonstrate that including domain knowledge does help improve retrieval results for the majority of queries examined. However, this work evaluated only a single clinical area with a limited number of queries. Future work is needed to validate these results in additional clinical areas with a larger number of queries.