VA Health Systems Research

HSR&D Citation Abstract

Natural language processing accurately categorizes findings from colonoscopy and pathology reports.

Imler TD, Morea J, Kahi C, Imperiale TF. Natural language processing accurately categorizes findings from colonoscopy and pathology reports. Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association. 2013 Jun 1; 11(6):689-94.


BACKGROUND and AIMS: Little is known about the ability of natural language processing (NLP) to extract meaningful information from free-text gastroenterology reports for secondary use.

METHODS: We randomly selected 500 linked colonoscopy and pathology reports from 10,798 nonsurveillance colonoscopies to train and test the NLP system. By using annotation by gastroenterologists as the reference standard, we assessed the accuracy of an open-source NLP engine that processed and extracted clinically relevant concepts. The primary outcome was the highest level of pathology. Secondary outcomes were location of the most advanced lesion, largest size of an adenoma removed, and number of adenomas removed.

RESULTS: The NLP system identified the highest level of pathology with 98% accuracy, compared with triplicate annotation by gastroenterologists (the standard). Accuracy values for location, size, and number were 97%, 96%, and 84%, respectively.

CONCLUSIONS: The NLP can extract specific meaningful concepts with 98% accuracy. It might be developed as a method to further quantify specific quality metrics.
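The evaluation described in METHODS, comparing NLP output with a reference annotation for each outcome, can be illustrated with a minimal sketch. It assumes accuracy means the fraction of reports in which the extracted value exactly matches the reference value; the abstract does not spell out the computation. The field names, the `field_accuracy` function, and the four toy records are hypothetical placeholders, not the study's data or its NLP engine.

```python
from typing import Dict, List

# Hypothetical field names for the four outcomes named in the abstract.
FIELDS = ["highest_pathology", "location", "largest_adenoma_mm", "adenoma_count"]

Record = Dict[str, object]


def field_accuracy(predicted: List[Record], reference: List[Record]) -> Dict[str, float]:
    """Return, per field, the fraction of reports where prediction equals reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must cover the same reports")
    if not reference:
        raise ValueError("at least one report is required")
    accuracy = {}
    for field in FIELDS:
        matches = sum(1 for p, r in zip(predicted, reference) if p[field] == r[field])
        accuracy[field] = matches / len(reference)
    return accuracy


# Toy reference annotations (illustrative only, not study data).
reference = [
    {"highest_pathology": "tubular adenoma", "location": "proximal",
     "largest_adenoma_mm": 6, "adenoma_count": 2},
    {"highest_pathology": "hyperplastic polyp", "location": "distal",
     "largest_adenoma_mm": 0, "adenoma_count": 0},
    {"highest_pathology": "normal", "location": "none",
     "largest_adenoma_mm": 0, "adenoma_count": 0},
    {"highest_pathology": "advanced adenoma", "location": "distal",
     "largest_adenoma_mm": 12, "adenoma_count": 3},
]

# Toy NLP output: identical to the reference except one adenoma count.
predicted = [dict(r) for r in reference]
predicted[3]["adenoma_count"] = 2

result = field_accuracy(predicted, reference)
for field in FIELDS:
    print(f"{field}: {result[field]:.0%}")
```

Exact-match scoring is the simplest choice; a real evaluation might instead allow a size tolerance or resolve disagreements among annotators before scoring.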

