The VA faces a rising prevalence of post-traumatic stress disorder (PTSD) among veterans of the US conflicts in Iraq and Afghanistan. A tool that delivers a usable timeline of events will make it possible to query the patient-level database on the basis of temporal patterns in selected clusters of clinical or personal history events.
Project objectives: (1) develop a tool (MedTARSQI) to construct a medical chronology, tracking temporal patterns across events, episodes, and trends of interest by mining the electronic health record; (2) implement it within the use case of PTSD symptom and treatment occurrences over time.
Aim 1: We evaluated (a) MedTARSQI-to-YTEX integration and term mapping to the event sub-domain (PTSD model) and (b) the integration of YTEX into MedTARSQI, by comparing the output of MedTARSQI to a human-annotated set of mappings from clinical classes to the time-theoretic classes of events. MedTARSQI was trained to map event terms to the PTSD-specific model, consisting of 15 symptom and treatment classes. MedTARSQI aligned these terms to 3 models: the time-theoretic model, the generalized clinical model, and the PTSD model. To evaluate MedTARSQI's performance, we compared its output to human annotations of PTSD classes.
Aim 2: Evaluating Event Equivalence Detection. MedTARSQI was trained to merge multiple occurrences of an event into a single event profile for each event it detected. To evaluate MedTARSQI's ability to merge equivalent events, we compared its output to human classification of event pairs as co-referring or not co-referring.
Aim 3: Evaluating Temporal Relation Assignments. We trained a probabilistic feature classifier to assign temporal relations (e.g., before, after) between pairs of PTSD events. To evaluate MedTARSQI's ability to correctly assign temporal relations to pairs of PTSD events, we compared the result of performing temporal closure over its output to the sequences of PTSD events assigned by human annotators.
Aim 4: Evaluating a Patient Level Temporal Processing Engine: The PTSD Chronology. We tested whether integrating events extracted from multiple narrative documents with a selection of time-stamped structured data elements from the same time-frame as the narrative documents (e.g., medications, consults, diagnoses) in the patient EHR improves the temporal precision of the patient chronology.
To evaluate MedTARSQI's ability to correctly assign event sequences across multiple data elements within the EHR, we compared its output to human assignment of event sequences across multiple documents for a given patient, where the reviewers accessed narrative documents as well as a selection of structured data elements. We defined a set of relevant structured data elements for use in the structured/unstructured integration portion of the project. Reviewers used an ontology-creating tool to keep track of the elements they judged to be co-referential and subsequently to assign temporal relationships to the identified elements. To assess whether integrating time-stamped structured data elements with events extracted from multiple narrative documents improved the precision of MedTARSQI's event sequences, we measured both the accuracy of the event ordering and the proportion of events with temporally constrained boundaries, before and after the integration.
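The two measures named above (accuracy of event ordering and proportion of events with temporally constrained boundaries) can be illustrated with a minimal sketch. The report does not specify how either measure was computed, so the pairwise definition of ordering accuracy, the Event structure, and the function names below are assumptions made for illustration only.

```python
from itertools import combinations
from typing import List, NamedTuple, Optional, Sequence


class Event(NamedTuple):
    """Hypothetical event record; bounds are day offsets, None if unconstrained."""
    name: str
    earliest: Optional[int]
    latest: Optional[int]


def ordering_accuracy(system_order: Sequence[str], gold_order: Sequence[str]) -> float:
    """Fraction of event pairs, among events present in both sequences,
    that the system places in the same relative order as the reviewers."""
    gold_rank = {name: i for i, name in enumerate(gold_order)}
    sys_rank = {name: i for i, name in enumerate(system_order)}
    shared = [name for name in gold_order if name in sys_rank]
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    agree = sum(
        (sys_rank[a] < sys_rank[b]) == (gold_rank[a] < gold_rank[b])
        for a, b in pairs
    )
    return agree / len(pairs)


def proportion_bounded(events: List[Event]) -> float:
    """Share of events that have both an earliest and a latest bound."""
    if not events:
        return 0.0
    bounded = sum(e.earliest is not None and e.latest is not None for e in events)
    return bounded / len(events)


# Illustrative data only (not taken from the study corpus).
gold = ["nightmares", "sertraline", "therapy", "remission"]
system = ["nightmares", "therapy", "sertraline", "remission"]
events = [
    Event("nightmares", 0, 30),
    Event("sertraline", 35, 35),
    Event("therapy", None, 90),
    Event("remission", None, None),
]
accuracy = ordering_accuracy(system, gold)
bounded_share = proportion_bounded(events)
```

Computing both values once before and once after adding the structured data elements would give the before/after comparison described in the text.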
As a performance improvement for Aim 1, we refactored the temporal expression module. We compared GU-time, the MedTARSQI modification of GU-time, and Heidel-time on recognizing temporal expressions, classifying them, and assigning them date-time stamps in a set of de-identified discharge summaries. Scored against the gold standard, the three systems obtained the following results:
1. GU-time: expression type F-measure: 0.13; expression value F-measure: 0.11
2. Modified GU-time: expression type F-measure: 0.83; expression value F-measure: 0.80
3. Heidel-time: expression type F-measure: 0.50; expression value F-measure: 0.40
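For readers unfamiliar with the metric, the following is a minimal sketch of how an F-measure for expression type could be computed against a gold standard. The report does not state the matching criteria used (for example, exact versus overlapping spans), so exact span matching, the data layout, and the type labels below are assumptions.

```python
from typing import Dict, Tuple

Span = Tuple[int, int]  # (start, end) character offsets; hypothetical layout


def f_measure(gold: Dict[Span, str], predicted: Dict[Span, str]) -> float:
    """Balanced F-measure over exact span matches with identical labels."""
    true_positives = sum(
        1 for span, label in predicted.items() if gold.get(span) == label
    )
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative annotations only (not taken from the study corpus).
gold_types = {
    (0, 10): "DATE",
    (20, 28): "DURATION",
    (40, 52): "DATE",
    (60, 65): "TIME",
}
predicted_types = {
    (0, 10): "DATE",
    (20, 28): "DATE",  # span found, but wrong type
    (40, 52): "DATE",
}
type_f = f_measure(gold_types, predicted_types)
```

The same function applies to expression values if the labels are replaced by normalized date-time strings.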
We found that PTSD symptom and treatment events trivially mapped to 5 of the 7 time-theoretic events, an Aim 1 objective required for integrating the time-theoretic model with the clinical model.
The Aim 2 error analysis of event coreference detection among PTSD symptom and treatment mentions across multiple documents of a given patient demonstrated a need to refactor the feature set for event classification. The prior performance of 74.2% agreement with human reviewers, who rated candidate event pairs as co-referent or not co-referent, was acceptable for recognizing sameness of event but not sufficiently robust to give a temporal reasoning pipeline the precision needed for assigning temporal relations between events. This is because the central method for improving temporal relation assignment relies on exhaustively leveraging the contextual information across all co-referring events and on iteratively updating the temporal information each time an event is merged via confirmed co-reference among event pairings. We have incorporated an event coreference module within MedTARSQI, as opposed to importing YTEX results externally. This improved event coreference recognition to 89.3% agreement with human review within a corpus of 368 mental health notes from 58 PTSD patients.
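As an illustration of the merging step and of the agreement measure, the sketch below groups mentions into event profiles from pairwise co-reference decisions and computes percent agreement with human ratings. It is not the MedTARSQI implementation: the union-find grouping, the mention identifiers, and the function names are assumptions chosen for clarity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def merge_coreferent(mentions: Iterable[str], coreferent_pairs: Iterable[Pair]) -> List[List[str]]:
    """Group mentions into event profiles using union-find over co-referent pairs."""
    parent = {m: m for m in mentions}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in coreferent_pairs:
        parent[find(a)] = find(b)

    profiles = defaultdict(list)
    for m in parent:
        profiles[find(m)].append(m)
    return sorted(sorted(group) for group in profiles.values())


def percent_agreement(system: Dict[Pair, bool], human: Dict[Pair, bool]) -> float:
    """Percentage of candidate pairs rated by both sides that received the same label."""
    shared = [pair for pair in human if pair in system]
    if not shared:
        return 0.0
    same = sum(system[pair] == human[pair] for pair in shared)
    return 100.0 * same / len(shared)


# Illustrative mentions only (not taken from the study corpus).
mentions = ["m1", "m2", "m3", "m4", "m5"]
profiles = merge_coreferent(mentions, [("m1", "m3"), ("m3", "m5")])

system_labels = {("m1", "m3"): True, ("m3", "m5"): True, ("m2", "m4"): True, ("m1", "m2"): False}
human_labels = {("m1", "m3"): True, ("m3", "m5"): True, ("m2", "m4"): False, ("m1", "m2"): False}
agreement = percent_agreement(system_labels, human_labels)
```

Because merging is transitive, a single incorrect pair can fuse two unrelated profiles, which is one way to see why pairwise agreement needs to be high before temporal information is pooled across merged events.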
The higher-fidelity event coreference (Aim 2 result) has favorably impacted the Aim 3 goal of assigning temporal relations (e.g., before, after, during) between events detected in the corpus of mental health notes. Using features from each MedTARSQI module, and applying them to all detected events among patients having at least 3 consecutive mental health notes within a 12-month time frame, we assigned each event pair 1 temporal relation out of a possible set of 13 relations. We then used the temporal closure algorithms in the final two modules of the pipeline to propagate the temporal relations entailed by these assignments across all possible event pairs within each patient's note set. Using all detected events, not just those pertaining to PTSD, we were able to construct an average of 163 events and 19 event chains per patient document set. The temporal relations assigned by MedTARSQI between events within a chain and between event chains were compared to temporal relations assigned by two human reviewers, who were presented with sets of detected events from the selected mental health note corpus. MedTARSQI achieved a 71.9% agreement rate with the human review. In the performance results with the updated MedTARSQI system, a larger proportion of the detected events occurred within co-reference chains (39% of the total detected events). After running the improved MedTARSQI augmented with the co-reference module, 51% of the detected events gained in temporal specificity. Further, we have added two relationships that can be assigned between events: [simultaneous] and [near simultaneous].
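The idea of temporal closure, deriving entailed relations from the ones explicitly assigned, can be sketched on a deliberately reduced relation set. The report does not describe the closure algorithm used in the pipeline, and the sketch below handles only before, after, and simultaneous rather than the full set of 13 relations; the composition table and function name are assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple

BEFORE, AFTER, SIMULTANEOUS = "before", "after", "simultaneous"

# Composition table for the reduced relation set: if a R1 b and b R2 c, then a R c.
COMPOSE = {
    (BEFORE, BEFORE): BEFORE,
    (BEFORE, SIMULTANEOUS): BEFORE,
    (SIMULTANEOUS, BEFORE): BEFORE,
    (SIMULTANEOUS, SIMULTANEOUS): SIMULTANEOUS,
}


def temporal_closure(assertions: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], str]:
    """Return all entailed relations as {(a, b): relation}; raise on contradictions."""
    known: Dict[Tuple[str, str], str] = {}

    def add(a: str, rel: str, b: str) -> bool:
        if rel == AFTER:  # normalize "a after b" to "b before a"
            a, b, rel = b, a, BEFORE
        if a == b:
            if rel == BEFORE:
                raise ValueError(f"inconsistent: {a} before itself")
            return False
        prior = known.get((a, b))
        if prior == rel:
            return False
        if prior is not None or (b, a) in known:
            raise ValueError(f"inconsistent relations between {a} and {b}")
        known[(a, b)] = rel
        if rel == SIMULTANEOUS:  # simultaneity is symmetric
            known[(b, a)] = rel
        return True

    for a, rel, b in assertions:
        add(a, rel, b)

    changed = True
    while changed:  # repeat until no new relation can be derived
        changed = False
        for (a, b), r1 in list(known.items()):
            for (c, d), r2 in list(known.items()):
                if b == c and add(a, COMPOSE[(r1, r2)], d):
                    changed = True
    return known


# Illustrative events only (not taken from the study corpus).
closure = temporal_closure([
    ("trauma", BEFORE, "nightmares"),
    ("sertraline", AFTER, "nightmares"),
    ("sertraline", SIMULTANEOUS, "therapy"),
])
```

In this example the three assigned relations entail, among others, that the trauma precedes therapy, a relation no annotator or classifier stated directly. A full 13-relation treatment would need a larger composition table and sets of candidate relations per pair.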
These two added relations ([simultaneous] and [near simultaneous]) were used to update the final module for deriving the patient-level event sequence, where merging events via simultaneity changed either the time-stamp, the placement of an event within a sequence, or the duration of any of the events in the sequence. The integration of a selection of structured data elements with the events extracted from narrative documents resulted in a higher recall of total recognized PTSD events per patient-level document set: recall was 60% before integration and 67% after. The greater number of eligible events within candidate sequences yielded a higher rate of agreement in temporal relation assignments to PTSD events, boosting F-measure to 77%. The proportion of events with temporally constrained boundaries did not increase significantly post-integration. We believe that this result can be improved by broadening the inclusion criteria for primary events extracted from the narrative notes, so as to increase the pool of candidate structured events available for temporal or co-referential relationship assignment.
By leveraging language model reference standards developed for other projects (involving mental health, gastro-intestinal, cardiology and oncology) which were curated with temporal reasoning in mind, we have datasets across several clinical domains available for evaluation.
None at this time.
Mental, Cognitive and Behavioral Disorders
Epidemiology, Diagnosis, Technology Development and Assessment