HIR 09-002
Consortium for Healthcare Informatics Research: Clinical Inference and Modeling
Stephen Lee Luther, PhD MA
James A. Haley Veterans' Hospital, Tampa, FL
Funding Period: April 2009 - September 2013
BACKGROUND/RATIONALE:
The Consortium for Healthcare Informatics Research (CHIR) was a multi-disciplinary group of collaborating investigators affiliated with VA sites from across the US. The goal of the CHIR Clinical Inference and Modeling Project was to use machine learning (ML) techniques to augment natural language processing (NLP) and the interpretation of outputs from NLP programs in order to make clinical inferences. NLP and ML approaches are data intensive, requiring large numbers of cases and carefully labeled, human-annotated data.
OBJECTIVE(S):
The objectives of this project were:
1. Improve extraction of clinically relevant information using machine learning approaches.
2. Use machine learning and epidemiologic methods to infer temporal relationships from unstructured and structured data.
3. Construct classification and predictive models from novel feature sets.
METHODS:
For Objective 1, clinical information was extracted from the VA electronic medical record system and annotated by two clinical experts, with adjudication by a third, to provide a reference standard for ML analyses. Open source software programs were developed to facilitate the implementation of ML in support of clinical inference. A series of experiments was conducted to evaluate the impact of feature set quality on statistical text mining, a commonly used ML technique. The first experiment considered the effects of task complexity on model performance by initially restricting the models to a single text feature and then iteratively adding features until there were 300. The second experiment considered the effects of changing the training set sample size. The third and most extensive experiment investigated the effects of target quality by randomly introducing labeling errors. The error rate was raised in increments of 1% until half the cases were mislabeled. In addition, the experiment was repeated for different sample sizes, as in the previous experiment.
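The design of the third experiment (raising the labeling error rate in 1% steps up to 50%, repeated across sample sizes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: the synthetic one-feature cases, the toy nearest-centroid classifier standing in for a statistical text mining model, the sample sizes, and all function names are hypothetical.

```python
import random
from statistics import mean


def error_rate_schedule(step_pct=1, max_pct=50):
    """Error rates from 0% to max_pct% in step_pct% increments, as fractions."""
    return [pct / 100 for pct in range(0, max_pct + 1, step_pct)]


def flip_labels(labels, error_rate, rng):
    """Return a copy of binary labels with a fixed fraction flipped at random."""
    n_flip = round(error_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def make_cases(n, rng):
    """Synthetic stand-in for documents: one numeric feature per case (assumption)."""
    labels = [i % 2 for i in range(n)]
    features = [rng.gauss(1.0 if y else -1.0, 1.0) for y in labels]
    return features, labels


def fit_centroids(features, labels):
    """Toy classifier: mean feature value of each (possibly mislabeled) class."""
    return {c: mean(x for x, y in zip(features, labels) if y == c) for c in (0, 1)}


def accuracy(centroids, features, labels):
    """Share of cases whose nearest centroid matches the clean label."""
    predictions = [min(centroids, key=lambda c: abs(x - centroids[c])) for x in features]
    return mean(int(p == y) for p, y in zip(predictions, labels))


def run_experiment(sample_sizes=(250, 1000, 4000), seed=0):
    """Train on noisy labels, score on clean held-out labels, for each size and rate."""
    rng = random.Random(seed)
    test_x, test_y = make_cases(2000, rng)
    results = {}
    for n in sample_sizes:
        train_x, train_y = make_cases(n, rng)
        for rate in error_rate_schedule():
            noisy_y = flip_labels(train_y, rate, rng)
            model = fit_centroids(train_x, noisy_y)
            results[(n, rate)] = accuracy(model, test_x, test_y)
    return results


if __name__ == "__main__":
    results = run_experiment()
    for n in (250, 1000, 4000):
        print(n, [round(results[(n, r)], 3) for r in (0.0, 0.1, 0.3, 0.5)])
```

Errors are injected only into the training labels, while accuracy is measured against clean held-out labels, so any drop in performance can be attributed to target quality rather than to a noisy yardstick.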
For Objective 2, traditional epidemiological methods were combined with NLP to better understand temporal relationships in medication regimens. NLP was conducted on prescription drug instructions (SIGs) and outpatient infusion records for patients in the VA with rheumatoid arthritis (VARA), information that is not available in structured data. The extracted data are being incorporated into the Med History Estimator program. The NLP module was trained on 11,937 records. A validated set of the annotator-derived reference standard, containing 140 SIGs per medication and route, was used to evaluate NLP accuracy, and the 95% confidence interval (CI) of the accuracy was computed.
For Objective 3, the plan was to apply a variety of ML algorithms to construct heterogeneous models by combining existing structured data with information extracted from clinical notes and other sources of unstructured data.
FINDINGS/RESULTS:
Objective 1. An annotated data set of more than 1,000 progress notes for Veterans with PTSD was developed and posted on VINCI for use by investigators applying machine learning techniques to clinical data. Lessons learned during this effort were used by the CHIR PTSD project to develop a subsequent national sample of documents. Multiple open source software products were developed to support machine learning analyses, including:
1) Development of a new module based on the widely used open source negation program, NegEx, for use in the cTAKES pipeline;
2) Addition of modules to perform statistical text mining analyses in the VA using RapidMiner;
3) Development of the Automated Retrieval Console (ARC) for extraction of concept-level information from progress notes by non-informatics investigators;
4) Development of TagLine, a machine learning-based strategy to improve NLP extraction of semi-structured data.
The experiments on data quality using statistical text mining (STM) found that models based on the best (most predictive) features stabilized at high performance levels with a few thousand documents (or even one thousand with the best features). When the target quality experiment was repeated across sample sizes, the text mining models were found to be quite robust, even with small sample sizes. For instance, with a sample size of only 1,000 cases, models performed fairly well with a 30% error rate in the target.
Objective 2. The overall accuracy for injectable biologic and oral and injectable non-biologic DMARDs was 89.1% (95% CI 87.9%-90.3%). The lower bounds of the 95% CI for most medications were greater than 80%. These results indicate that the NLP module can be used to extract information from SIGs to calculate the weekly DMARD dose. When applied to a national sample, the use of NLP did not impact identification of VARA events. NLP data on the medication SIG did not impact estimates of course duration. NLP on infusion data moved estimates of course duration closer to the VARA reference standard, but the improvements were not statistically significant.
Objective 3. The Clinical Inference and Modeling project investigators planned to use large heterogeneous data sets developed in the CHIR clinical projects to develop heterogeneous models. Unfortunately, delays in software development to support the creation of these data sets made it impossible to use these data to complete the analyses. In lieu of these activities, Inference and Modeling investigators worked directly with the clinical projects to develop data sets for NLP analyses and applied ML techniques to improve NLP output.
IMPACT:
Open source software developed through the Clinical Inference and Modeling project has been made available for use by health services researchers, both in the VA and the private sector.
Results of this project will allow for more effective construction of training sets, which is especially important when large-scale annotation is needed. They will also enhance our ability to extract and use temporal relations from processed text, which is crucial to clinical decision support, and to construct clinically useful predictive models to enhance patient care.
DRA: Health Systems Science, Infectious Diseases
DRE: Research Infrastructure, Diagnosis, Technology Development and Assessment
Keywords: none
MeSH Terms: none