The Consortium for Healthcare Informatics Research (CHIR) was a multi-disciplinary group of collaborating investigators affiliated with VA sites from across the US. The goal of the CHIR Clinical Inference and Modeling Project was to use machine learning (ML) techniques to augment natural language processing (NLP), and the interpretation of outputs from NLP programs, in order to make clinical inferences. NLP and ML approaches are data intensive, requiring large numbers of cases and carefully labeled, human-annotated data.
The objectives of this project were:
1. Improve extraction of clinically relevant information using machine learning approaches.
2. Use machine learning and epidemiologic methods to infer temporal relationships from unstructured and structured data.
3. Construct classification and predictive models from novel feature sets.
For Objective 1, clinical information was extracted from the VA electronic medical record system and annotated by two clinical experts, with adjudication by a third, to provide a reference standard for ML analyses. Open source software programs were developed to facilitate the implementation of ML in support of clinical inference. A series of experiments was conducted to evaluate the impact of feature set quality on statistical text mining (STM), a commonly used ML technique. The first experiment considered the effects of task complexity on model performance by initially restricting the models to a single text feature and then iteratively adding features until there were 300. The second experiment considered the effects of changing the training set sample size. The third and most extensive experiment investigated the effects of target quality by randomly introducing labeling errors: the error rate was raised in increments of 1% until half the cases were mislabeled. This third experiment was also repeated for different sample sizes, as in the second experiment.
For Objective 2, traditional epidemiological methods were combined with NLP to better understand temporal relationships in medication regimens. NLP was applied to prescription drug instructions (SIGs) and outpatient infusion records, which are not available in structured data, for patients in the VA with rheumatoid arthritis (VARA). The extracted data are being incorporated into the Med History Estimator program. The NLP module was trained on 11,937 records. NLP accuracy was evaluated against a validated set from the annotator-derived reference standard that contained 140 SIGs per medication and route, and the 95% confidence interval (CI) of the accuracy was computed.
For Objective 3, the plan was to apply a variety of ML algorithms to construct heterogeneous models by combining existing structured data with information extracted from clinical notes and other sources of unstructured data.
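The target-quality experiment described for Objective 1 (random labeling errors raised in 1% steps until half the cases are mislabeled) can be illustrated with a minimal sketch. This is not the project's code: the function names `inject_label_errors` and `error_rate_schedule`, the assumption of binary 0/1 labels, and the synthetic labels are all hypothetical, and the model training step is left as a comment.

```python
import random


def inject_label_errors(labels, error_rate, seed=0):
    """Return a copy of binary (0/1) labels with a fraction `error_rate` flipped.

    Hypothetical helper: exactly round(error_rate * n) randomly chosen
    cases are mislabeled, so the realized error rate matches the target.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    rng = random.Random(seed)
    n_flip = round(error_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def error_rate_schedule(step=0.01, max_rate=0.50):
    """Error rates from 0 up to `max_rate` in increments of `step`."""
    n_steps = round(max_rate / step)
    return [round(i * step, 10) for i in range(n_steps + 1)]


if __name__ == "__main__":
    # Synthetic stand-in for a clean reference standard (assumption).
    clean = [i % 2 for i in range(1000)]
    for rate in error_rate_schedule():
        noisy = inject_label_errors(clean, rate, seed=42)
        n_wrong = sum(a != b for a, b in zip(clean, noisy))
        # A text mining model would be trained on `noisy` here and
        # evaluated against the clean labels; that step is omitted.
        print(f"target error rate {rate:.2f}: {n_wrong} of {len(clean)} mislabeled")
```

Repeating the loop over several training set sizes would mirror the way the experiment was rerun for different sample sizes.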
Objective 1. An annotated data set of more than 1,000 progress notes for Veterans with PTSD was developed and posted on VINCI for use by investigators applying machine learning techniques to clinical data. Lessons learned during this effort were used by the CHIR PTSD project to develop a subsequent national sample of documents. Multiple open source software products were developed to support machine learning analyses, including: 1) a new module, based on the widely used open source negation program NegEx, for use in the cTAKES pipeline; 2) modules to perform statistical text mining analyses in the VA using RapidMiner; 3) the Automated Retrieval Console (ARC) for extraction of concept-level information from progress notes by non-informatics investigators; and 4) TagLine, a machine learning-based strategy to improve NLP extraction of semi-structured data. The experiments on data quality using statistical text mining (STM) found that models based on the best (most predictive) features stabilized at high performance levels with a few thousand documents, or even one thousand with the best features. The target-quality experiment was repeated for different sample sizes, and the text mining models were found to be quite robust, even with small sample sizes. For instance, with a sample size of only 1,000 cases, models performed fairly well with 30% error rates in the target.
Objective 2. The overall accuracy for injectable biologic and oral and injectable non-biologic disease-modifying antirheumatic drugs (DMARDs) was 89.1% (95% CI 87.9%-90.3%). The lower bounds of the 95% CI for most medications were greater than 80%. These results indicate that the NLP module can be used to extract information from SIGs to calculate the weekly DMARD dose. When applied to a national sample, the use of NLP did not affect identification of VARA events. NLP data on the medication SIG did not affect estimates of course duration. NLP on infusion data moved estimates of course duration toward the VARA reference standard, but the improvements were not statistically significant.
Objective 3. The Clinical Inference and Modeling project investigators planned to use large heterogeneous data sets developed in the CHIR clinical projects to develop heterogeneous models. Unfortunately, delays in software development to support the creation of these data sets made it impossible to use these data to complete the analyses. In lieu of these activities, Inference and Modeling investigators worked directly with the clinical projects to develop data sets for NLP analyses and applied ML techniques to improve NLP output.
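The Objective 2 results report NLP accuracy together with a 95% CI. The report does not say which interval method was used, so the following is only a sketch under an assumption: it uses the Wilson score interval for a binomial proportion. The function name `wilson_interval` and the example counts (125 correct out of 140 SIGs) are hypothetical and are not taken from the project data.

```python
import math


def wilson_interval(n_correct, n_total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumed method for illustration only; the source does not state
    how its confidence intervals were calculated.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must be between 0 and n_total")
    p = n_correct / n_total
    denom = 1 + z ** 2 / n_total
    center = (p + z ** 2 / (2 * n_total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_total + z ** 2 / (4 * n_total ** 2)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical counts for one medication and route.
    n_correct, n_total = 125, 140
    low, high = wilson_interval(n_correct, n_total)
    print(f"accuracy {n_correct / n_total:.3f}, 95% CI {low:.3f}-{high:.3f}")
```

With 140 SIGs per medication and route, per-medication intervals are necessarily wider than the pooled interval, which is consistent with the report describing the lower bounds for most medications only as greater than 80%.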
Open source software developed through the Clinical Inference and Modeling project has been made available for use by health services researchers, both in the VA and the private sector. Results of this project will allow for more effective construction of training sets, which is especially important when large-scale annotation is needed. They will also enhance our ability to extract and use temporal relations from processed text, which is crucial to clinical decision support, and to construct clinically useful predictive models to enhance patient care.
External Links for this Project
- Konovalov S, Scotch M, Post L, Brandt C. Biomedical informatics techniques for processing and analyzing web blogs of military service members. Journal of medical Internet research. 2010 Oct 5; 12(4):e45.
- McCart JA, Berndt DJ, Jarman J, Finch DK, Luther SL. Finding falls in ambulatory care clinical documents using statistical text mining. Journal of the American Medical Informatics Association : JAMIA. 2013 Sep 1; 20(5):906-14.
- LaFleur J, Nelson RE, Sauer BC, Nebeker JR. Overestimation of the effects of adherence on outcomes: a case study in healthy user bias and hypertension. Heart (British Cardiac Society). 2011 Nov 1; 97(22):1862-9.
- Garla V, Lo Re V, Dorey-Stein Z, Kidwai F, Scotch M, Womack J, Justice A, Brandt C. The Yale cTAKES extensions for document classification: architecture and application. Journal of the American Medical Informatics Association : JAMIA. 2011 Sep 1; 18(5):614-20.
- McCart JA, Finch DK, Jarman J, Hickling E, Lind JD, Richardson MR, Berndt DJ, Luther SL. Using ensemble models to classify the sentiment expressed in suicide notes. Biomedical informatics insights. 2012 Jan 30; 5(Suppl. 1):77-85.
- Berndt DJ, McCart JA, Luther SL. Using ontology network structure in text mining. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2010 Nov 13; 2010:41-5.
- Luther S, Berndt D, Finch D, Richardson M, Hickling E, Hickam D. Using statistical text mining to supplement the development of an ontology. Journal of Biomedical Informatics. 2011 Dec 1; 44 Suppl 1:S86-93.
- Lo Re V, Lim JK, Goetz MB, Tate J, Bathulapalli H, Klein MB, Rimland D, Rodriguez-Barradas MC, Butt AA, Gibert CL, Brown ST, Kidwai F, Brandt C, Dorey-Stein Z, Reddy KR, Justice AC. Validity of diagnostic codes and liver-related laboratory abnormalities to identify hepatic decompensation events in the Veterans Aging Cohort Study. Pharmacoepidemiology and drug safety. 2011 Jul 1; 20(7):689-99.
- McCart J, Jarman J, Finch D, Luther SL. An Introductory Look at Statistical Text Mining for Health Services Researchers. Presented at: VA HSR&D National Meeting; 2011 Feb 16; Washington, DC.
- Jarman J, McCart J, Luther SL, Berndt DJ. Automated Rule Development Using Text Mining. Poster session presented at: American Medical Informatics Association Annual Symposium; 2012 Nov 3; Chicago, IL.
- Jarman J, Luther SL, McCart J, Berndt DJ. Combining Natural Language Processing and Statistical Text Mining: Classifying Fall-Related Progress Notes. Presented at: VA HSR&D National Meeting; 2011 Feb 16; Washington, DC.
- Finch D, Berndt DJ, Luther SL. Extracting Semi-Structured Text Elements in Medical Progress Notes: A Machine Learning Approach. Poster session presented at: American Medical Informatics Association Annual Symposium; 2012 Nov 3; Chicago, IL.
- Finch D, McCart J, Luther SL. TagLine: Information Extraction for Semi-Structured Text in Medical Progress Notes. Presented at: American Medical Informatics Association Annual Symposium; 2014 Nov 15; Washington, DC.
- Berndt DJ, Finch D, Foulis P, Luther SL. The Impact of Data and Target Quality in Text Mining Clinical Notes. Poster session presented at: American Medical Informatics Association Annual Symposium; 2010 Nov 13; Washington, DC.
- McCart J, Jarman J, Matheny ML. TrANEMap: A Fast Tree-Based Named Entity Recognition Engine. Poster session presented at: American Medical Informatics Association Annual Symposium; 2011 Oct 22; Washington, DC.
- McCart J, Berndt DJ, Finch D, Jarman J, Luther SL. Using Statistical Text Mining to Identify Fall-related Injuries in VHA Ambulatory Care Data. Poster session presented at: American Medical Informatics Association Annual Symposium; 2012 Nov 3; Chicago, IL.
Health Systems, Infectious Diseases
Diagnosis, Technology Development and Assessment, Research Infrastructure