2012 HSR&D/QUERI National Conference Abstract
2027 — Tools for Exploring and Analyzing Text Data in Health Services Research and Epidemiology
Samore MH, Zeng QT, and DuVall S, VA Salt Lake City; D'Avolio L, VA Boston; Gundlapalli A, Divita G, and Nebeker J, VA Salt Lake City
This workshop highlights two VA informatics research initiatives: the Consortium for Healthcare Informatics Research (CHIR) and the Veterans Informatics and Computing Infrastructure (VINCI).
The presentation will be divided into two parts. First, we will introduce and demonstrate products that enable VA researchers to work with text data in clinical notes. Second, we will give examples of applications of these tools in health services research and epidemiology. The tools are designed for interactive use, including by individuals without prior experience in natural language processing. v3NLP is a tool that allows users to construct rules and deploy dictionaries to find text expressions or identify concepts. It facilitates a modular, standards-based approach to information extraction. Another tool, Automated Retrieval Console (ARCv2), is designed to extract coded concepts from text or to map documents into clinical categories using a machine learning-based approach. The system relies on an iterative process of training and validation to generate classification algorithms, and it allows the user to choose among several forms of text data as inputs. The search tool, Voogle, lets users enter terms or phrases into a text box to find clinical documents that contain particular items of information. The retrieved documents are sorted by relevance. Voogle also supports patient-level classification, combining structured data and text.
The power of using these tools to analyze patient records will be highlighted, drawing on experience from several VA research studies. These projects encompass a diverse array of challenges, including interpretation of chest radiology reports, identification of patients with specific types of exposures during deployment, and extraction of symptoms associated with medically unexplained syndromes. Practical tips for working with text will be provided, within the broader context of the principles involved in climbing the ladder from data to information to knowledge.
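For readers new to text processing, the two simplest ideas mentioned above, dictionary- and rule-based concept finding and relevance-sorted search, can be illustrated with a minimal Python sketch. This is not the code or interface of v3NLP, ARCv2, or Voogle; the dictionary entries, the negation rule, and the function names `find_concepts` and `rank_notes` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical dictionary mapping surface terms to concept labels.
DICTIONARY = {
    "infiltrate": "PULMONARY_INFILTRATE",
    "pneumonia": "PNEUMONIA",
    "pleural effusion": "PLEURAL_EFFUSION",
}

# Hypothetical rule: a negation cue earlier in the same sentence negates the term.
NEGATION = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)


def find_concepts(text, dictionary=DICTIONARY):
    """Return (concept, matched_text, negated) tuples in order of appearance."""
    hits = []
    for term, concept in dictionary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            # Only look at the part of the current sentence that precedes the match.
            sentence_start = text.rfind(".", 0, m.start()) + 1
            negated = bool(NEGATION.search(text[sentence_start:m.start()]))
            hits.append((m.start(), concept, m.group(0), negated))
    return [(concept, matched, negated) for _, concept, matched, negated in sorted(hits)]


def rank_notes(notes, query):
    """Rank note ids by total count of query terms (a crude relevance score)."""
    terms = query.lower().split()
    scores = {}
    for note_id, text in notes.items():
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        score = sum(counts[t] for t in terms)
        if score:
            scores[note_id] = score
    # Highest score first; ties broken by note id for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))


if __name__ == "__main__":
    report = "No infiltrate. Small pleural effusion on the left."
    print(find_concepts(report))
    notes = {"a": "cough and fever", "b": "fever, fever, chills", "c": "normal exam"}
    print(rank_notes(notes, "fever"))
```

Real clinical text needs far more than this (section detection, abbreviations, misspellings, richer negation handling), which is part of why the limitations of clinical documentation matter when interpreting extracted results.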
The promise of natural language processing is that it will unlock vast amounts of information that are currently stored in text notes. However, it is crucial for investigators to also be aware of limitations and constraints associated with clinical documentation. The entire VA HSR&D research community can benefit from using these tools.
Assumed Audience Familiarity with Topic: