Decision support, surveillance, and quality care initiatives have the potential to improve the health of Veterans by leveraging the rich clinical data contained in the electronic health record. However, much of this data is locked in free-text reports and is inaccessible to computerized applications. Natural language processing (NLP) tools can structure this free text, and many are already being developed and hosted on the VA Informatics and Computing Infrastructure (VINCI). Unfortunately, a large gap between development and clinical/research practice limits their applicability: most NLP tools require domain-specific customization, the combination of multiple, often incompatible tools, and external displays to aid interpretation.
A growing body of experience with interactive visualizations of clinical data provides multiple models of how information abstracted from clinical records might best be displayed to users. Extensive experience in fields such as end-user programming and user-centered design provides guidance on how some of the shortcomings of clinical NLP applications might be addressed. We envision a flexible, user-centered development environment that will allow non-NLP experts (VA clinicians and medical researchers) to customize, investigate, and apply NLP to clinical tasks and to develop visual interfaces that integrate and display electronic health record (EHR) data using these methods.
To realize this goal, we will develop an environment built upon a solid foundation of successful VA NLP components, developed largely by members of this CREATE team. We will apply user-centered design and evaluation techniques to address the gap between existing NLP technology and the needs of end users.
Consistent with the CREATE as a whole, our approach is user-centered: the data needs of users are considered in the context of their clinical workflow. The development environment will be a general-purpose toolkit for researchers developing diverse types of displays for many different domains.
Specific Aim 1: Develop an environment for knowledge authoring that links user information extraction requirements with representation required for NLP.
Specific Aim 2: Develop and evaluate workbenches for NLP customization and document classification.
Specific Aim 3: Develop a visualization workbench for creating visualizations for clinical data.
To date, we have met weekly to identify and demo several new tools that could integrate well into the platform, and we have created a charter to set project milestones and track study progress over the next year.
Within the third year of this project, we have completed the following activities:
a) Tool Development: Advanced the development of the following tools in particular: Knowledge Author, Moonstone, VIP, pyConText, eHOST, and Chart Review.
b) User Studies: Conducted user studies to optimize the user interface of Knowledge Author and to evaluate pyConText for an information extraction task. The studies covered several domains: drug usage in depressed patients, site infections in surgical patients, critical findings in radiology cases, sources of uncertainty in suspected pneumonia cases, social determinants of health in cardiac cases, and exam qualities and findings from colonoscopy patients (see references).
c) Evaluation Workbench: The Evaluation Workbench is currently functional and is being updated for full integration into the IE-Viz data model.
d) Synonym Generation: Developed a multi-modal synonym generation method that recommends new lexical variants of concepts to end users, improving vocabulary coverage for a given domain knowledge base. The methods include neural network (word2vec), linguistic/rule-based, and vocabulary-based approaches. We developed an API to integrate these methods into the existing workbench.
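To illustrate how several synonym sources might be combined into one recommendation list, the following is a minimal sketch and not the project's actual API. All names, the toy embedding table (a stand-in for a trained word2vec model), the toy vocabulary, and the ranking rule (candidates proposed by more sources rank first) are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical toy embeddings standing in for a trained word2vec model
# (the vectors are made up for illustration).
TOY_VECTORS = {
    "pneumonia": (0.9, 0.1, 0.0),
    "pna": (0.85, 0.15, 0.05),
    "infiltrate": (0.5, 0.5, 0.2),
    "colonoscopy": (0.0, 0.2, 0.9),
}

# Hypothetical toy vocabulary standing in for a curated terminology.
TOY_VOCABULARY = {
    "pneumonia": {"lung infection", "pna"},
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_candidates(term, threshold=0.9):
    """Neural-style source: terms whose vectors lie close to the query term."""
    if term not in TOY_VECTORS:
        return set()
    target = TOY_VECTORS[term]
    return {
        word
        for word, vec in TOY_VECTORS.items()
        if word != term and cosine(target, vec) >= threshold
    }


def rule_candidates(term):
    """Rule-based source: simple surface variants (hyphenation, naive plural)."""
    variants = set()
    if " " in term:
        variants.add(term.replace(" ", "-"))
    if "-" in term:
        variants.add(term.replace("-", " "))
    if not term.endswith("s"):
        variants.add(term + "s")
    variants.discard(term)
    return variants


def vocabulary_candidates(term):
    """Vocabulary-based source: synonyms listed in the toy terminology."""
    return set(TOY_VOCABULARY.get(term, set()))


def recommend_synonyms(term):
    """Merge all sources; rank by number of agreeing sources, then alphabetically."""
    generators = {
        "embedding": embedding_candidates,
        "rule": rule_candidates,
        "vocabulary": vocabulary_candidates,
    }
    sources = defaultdict(set)
    for name, generate in generators.items():
        for candidate in generate(term):
            sources[candidate].add(name)
    ranked = sorted(sources.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(candidate, sorted(names)) for candidate, names in ranked]


if __name__ == "__main__":
    for candidate, names in recommend_synonyms("pneumonia"):
        print(candidate, names)
```

Keeping the contributing sources alongside each candidate lets an end user see why a variant was suggested before accepting it into the knowledge base; a real system would likely replace the agreement count with a more careful scoring scheme.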
A new generation of tools is needed for effective use and analysis of free-text clinical records. The VA has been a leader in developing NLP capability, with the hope of implementing NLP in system-, patient-, and team-facing applications being developed in hi2 HMP and iEHR. We address the gap between development and clinical/research practice, which limits applicability, through the development and evaluation of an Information Extraction-Visualization (IE-Viz) Toolkit.
- Trivedi G, Pham P, Chapman WW, Hwa R, Wiebe J, Hochheiser H. NLPReViz: an interactive tool for natural language processing on clinical text. Journal of the American Medical Informatics Association: JAMIA. 2018 Jan 1; 25(1):81-87.
- Sauer BC, Teng CC, Accortt NA, Burningham Z, Collier D, Trivedi M, Cannon GW. Models solely using claims-based administrative data are poor predictors of rheumatoid arthritis disease activity. Arthritis Research & Therapy. 2017 May 8; 19(1):86.
- Scuba W, Tharp M, Mowery D, Tseytlin E, Liu Y, Drews FA, Chapman WW. Knowledge Author: facilitating user-driven, domain content development to support clinical information extraction. Journal of Biomedical Semantics. 2016 Jun 23; 7(1):42.
- Velupillai S, Mowery DL, Abdelrahman S, Christensen L, Chapman WW. Towards a Generalizable Time Expression Model for Temporal Reasoning in Clinical Notes. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium. 2015 Nov 5; 2015:1252-9.
- Chapman WW. Information Extraction and Visualization Toolkit (IE-VIZ) Pipeline. 2017 Oct 1.
- Chapman WW. Moonstone. 2017 Jul 1.
- Chapman WW, Mowery DL. Web-based Evaluation Workbench. 2017 Jul 1.
- Weir CR, Samore MH, Chapman WW, Jones MM. Population health Dashboards. 2017 Jul 1.
- Chapman WW. Knowledge Author. 2017 May 15.
- Scuba W, Tharp M, Tseytlin E, Liu Y, Drews FA, Chapman WW. Knowledge Author: Creating Domain Content for NLP Information Extraction. Paper presented at: Semantic Mining in Biomedicine International Biennial Symposium; 2015 Oct 6; Aveiro, Portugal.