Health Services Research & Development

HIR 10-001 – HSR&D Study

HIR 10-001
Pro-WATCH: Epidemiology of Medically Unexplained Syndromes
Matthew H. Samore MD
VA Salt Lake City Health Care System, Salt Lake City, UT
Funding Period: September 2010 - August 2014

BACKGROUND/RATIONALE:
More than 2 million members of the armed services have been deployed in Afghanistan or Iraq since 2001. Operation Enduring Freedom (OEF) and Operation Iraqi Freedom (OIF) veterans are experiencing a wide variety of health problems related to deployment. Although veterans of previous wars have experienced a variety of chronic, unexplained symptoms, relatively little is known about the prevalence of medically unexplained symptoms and syndromes (MUS) in OEF/OIF veterans.


OBJECTIVE(S):
The objectives of this study are to: (a) Use natural language processing (NLP) techniques to extract information about symptoms from Veterans Affairs (VA) ambulatory progress notes; (b) Validate an algorithm to detect the presence of a MUS, using responses to symptom questionnaires as the reference standard; and (c) Apply automated algorithms to national VA data to assess variation in prevalence of MUS by year, region, deployment exposure, blast injury, age, and co-morbid illness, including post-traumatic stress disorder (PTSD).


METHODS:
Tools were developed using NLP to support semi-automated ontology creation. Design and development included input from multiple ontology experts. Informatics ontologies for irritable bowel syndrome (IBS), fibromyalgia, and chronic fatigue were created using the new tools.

A symptom tool was trained in three steps. In step one, randomly selected Veteran documents were annotated to establish a reference standard; a team of four reviewers was trained for the annotation task. In step two, the reference standard was used to train an NLP pipeline to identify positive assertions of symptoms. To maximize recall and precision, different negation algorithms and machine learning methods were incorporated into the NLP pipeline and compared. In step three, the NLP pipeline was applied to the entire corpus.
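As a rough illustration of step two, the sketch below flags positive assertions of symptoms using a small lexicon and a simple negation window, then scores the output against an annotated reference set. The lexicon, negation cues, terminator words, window size, and function names are all hypothetical; this is not the study's pipeline, which compared several negation algorithms and machine learning methods.

```python
import re

# Hypothetical mini-lexicon and cue lists, for illustration only.
SYMPTOMS = ["fatigue", "abdominal pain", "headache", "nausea"]
NEGATION_CUES = ["no", "denies", "without", "negative for"]
TERMINATORS = ["but", "however"]  # words that end the scope of a negation
WINDOW = 5  # number of preceding tokens searched for a negation cue


def positive_symptoms(sentence):
    """Return lexicon symptoms that are asserted (not negated) in a sentence."""
    text = sentence.lower()
    found = []
    for symptom in SYMPTOMS:
        for match in re.finditer(r"\b" + re.escape(symptom) + r"\b", text):
            preceding = re.findall(r"[a-z]+", text[:match.start()])
            # Drop everything up to the most recent scope terminator.
            for i in range(len(preceding) - 1, -1, -1):
                if preceding[i] in TERMINATORS:
                    preceding = preceding[i + 1:]
                    break
            context = " ".join(preceding[-WINDOW:])
            negated = any(
                re.search(r"\b" + re.escape(cue) + r"\b", context)
                for cue in NEGATION_CUES
            )
            if not negated and symptom not in found:
                found.append(symptom)
    return found


def precision_recall(predicted, reference):
    """Score predicted symptoms against the annotated reference symptoms."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall
```

A fixed token window is a deliberately crude stand-in for negation scope; comparing alternative negation strategies with `precision_recall` mirrors, in miniature, the comparison described above.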

The annotated instances are currently being reviewed by human reviewers for validity. An additional 250 documents are being annotated to increase the number of symptom instances in the reference standard. Additional methods are being applied to the NLP pipeline to improve symptom recognition.

An algorithm to detect the presence of MUS will be trained and validated against a reference standard consisting of Veteran responses to a symptom questionnaire and clinician-reviewed records. A patient-level record review tool is being built to assist in the clinician review process.
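Validation against a reference standard of this kind is commonly summarized with sensitivity, specificity, and positive predictive value. The sketch below shows one way to compute them from paired per-patient flags; the function name and the input format (two equal-length lists of booleans) are assumptions made for illustration, not details from the study.

```python
def validation_metrics(predicted, reference):
    """Compare algorithm MUS flags with reference-standard flags.

    Both arguments are equal-length lists of booleans, one entry per patient
    (hypothetical format). A metric is None when its denominator is zero.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    tp = fp = tn = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif not pred and ref:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true MUS cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases correctly cleared
        "ppv": ratio(tp, tp + fp),          # share of flagged patients who are cases
    }
```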

The validated algorithm will be used to analyze the epidemiology of MUS using all available electronic records in VINCI on OEF/OIF veterans. The prevalence of MUS will be reported, including the specific prevalence of chronic fatigue syndrome, irritable bowel syndrome, and fibromyalgia. Multivariable mixed-effects Poisson regression models will be used to determine the independent contributions of year, region, and duration of deployment.
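Before any regression modeling, the descriptive step amounts to computing crude prevalence within each stratum. The sketch below does only that; it is not the mixed-effects Poisson model named above, and the record layout (dictionaries with a boolean `mus` field and stratum fields such as `region`) and the toy records are hypothetical.

```python
from collections import defaultdict


def prevalence_by_stratum(records, key):
    """Crude MUS prevalence (cases / patients) for each value of `key`."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [cases, patients]
    for record in records:
        stratum = record[key]
        counts[stratum][1] += 1
        if record["mus"]:
            counts[stratum][0] += 1
    return {stratum: cases / total for stratum, (cases, total) in counts.items()}


# Toy records, invented purely to show the call; not study data.
toy_records = [
    {"region": "A", "mus": True},
    {"region": "A", "mus": False},
    {"region": "A", "mus": False},
    {"region": "A", "mus": False},
    {"region": "B", "mus": True},
    {"region": "B", "mus": False},
]
print(prevalence_by_stratum(toy_records, "region"))
```

Crude stratum prevalences do not adjust for the other covariates; separating the independent contributions of year, region, and deployment duration is what the regression models are for.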


FINDINGS/RESULTS:
Health data on 856,815 Veterans who had been deployed in Iraq or Afghanistan between October 2001 and October 2011 were analyzed. The corpus of text data for the OEF/OIF Veterans included 46 million clinical documents that belonged to the Text Integration Utility (TIU) note type.

A reference standard of positively asserted symptoms has been completed. Human annotators reviewed 750 documents and identified 5,572 symptoms. A lexicon of symptoms has been compiled from the symptoms identified by annotation. Within each document, subjective symptom expressions were compared to assertions of symptoms in clinical terms and to the ICD-9-CM codes assigned for the encounter. A total of 543 subjective symptom expressions were identified, of which 66.5% were categorized as mental/behavioral experiences and 33.5% as somatic experiences. Machine learning for symptoms is complete. An ontology development tool has been created and pilot tested. Ontologies have been developed for IBS, fibromyalgia, and chronic fatigue.

An additional 250 documents need to be annotated for use in the reference standard, and human annotators have recently begun this work. The original 750 documents are also being re-reviewed for accuracy of the symptom findings. Once all annotation is complete, the algorithm will be tested again on the test set of documents.

IMPACT:
By examining symptoms and symptom clusters among post-deployment OEF/OIF Veterans, VA will be able to continually assess the health status and health care utilization of these Veterans. Specifically, VA will be able to measure the prevalence of MUS, identify symptoms that are related to combat-related exposures, and identify co-morbid conditions. With this information, VA can provide more comprehensive care to the patient rather than reacting to and treating individual symptoms.



PUBLICATIONS:

Journal Articles

  1. Jones B, Gundlapalli AV, Jones JP, Brown SM, Dean NC. Admission decisions and outcomes of community-acquired pneumonia in the homeless population: a review of 172 patients in an urban setting. American journal of public health. 2013 Dec 1; 103 Suppl 2:S289-93.
  2. Toth DJ, Gundlapalli AV, Schell WA, Bulmahn K, Walton TE, Woods CW, Coghill C, Gallegos F, Samore MH, Adler FR. Quantitative models of the dose-response and time course of inhalational anthrax in humans. PLoS pathogens. 2013 Aug 15; 9(8):e1003555.
  3. DeLisle S, Kim B, Deepak J, Siddiqui T, Gundlapalli A, Samore M, D'Avolio L. Using the electronic medical record to identify community-acquired pneumonia: toward a replicable automated strategy. PLoS ONE. 2013 Aug 13; 8(8):e70944.

Conference Presentations

  1. Meystre S, Samore MH. Domain and Application Ontologies for Medically Unexplained Syndromes. Paper presented at: American Medical Informatics Association Annual Symposium; 2012 Nov 3; Chicago, IL.
  2. Gundlapalli AV, Samore MH, Palmer M, Tuteja AK, Carter M, Shen S, South B, Forbush T, Divita G. Annotation of Symptoms in VA Clinical Documents. Poster session presented at: Integrating Data for Analysis, Anonymization, and Sharing Annual Conference; 2012 Sep 29; La Jolla, California.
  3. Samore MH, Nelson R. Screening for Homelessness in the Free Text of VA Clinical Documents using Natural Language Processing. Poster session presented at: VA HSR&D / QUERI National Meeting; 2012 Jul 16; National Harbor, MD.
  4. Forbush T, Gundlapalli AV, Palmer M, Shen S, South B, Divita G, Carter M, Redd AM, Butler J, Samore MH. Sitting on Pins and Needles. Paper presented at: American Medical Informatics Association Spring Congress; 2012 Mar 20; San Francisco, CA.
  5. South B, Palmer M, Shen S, Divita G, DuVall SL, Samore MH, Gundlapalli AV. Using Clinician Mental Models to Guide Annotation of Medically Unexplained Symptoms and Syndromes found in VA Clinical Documents. Paper presented at: International Society for Disease Surveillance Annual Conference; 2011 Dec 7; Park City, UT.
  6. Zeng Q, Samore MH, Divita G. Finding Medically Unexplained Symptoms within VA Clinical Documents using v3NLP. Poster session presented at: International Society for Disease Surveillance Annual Conference; 2011 Dec 7; Park City, UT.
  7. Palmer M, South B, Shen S, Tuteja AK, Divita G, Samore MH, Gundlapalli AV. Identification and Classification of Medically Unexplained Symptoms in VA Clinical Documents. Poster session presented at: VA HSR&D National Meeting; 2011 Feb 16; National Harbor, MD.
  8. South B, Palmer M, Shen S, Divita G, DuVall SL, Samore MH. Using Clinician Mental Models to Guide Annotation of Medically Unexplained Symptoms and Syndromes found in VA Clinical Documents. Poster session presented at: VA HSR&D National Meeting; 2011 Feb 16; National Harbor, MD.


DRA: Cardiovascular Disease, Military and Environmental Exposures, Autoimmunity and Allergy
DRE: Research Infrastructure, Epidemiology, Diagnosis
Keywords: Clinical Diagnosis and Screening, Healthcare Algorithms, Information Management, Knowledge Integration, Natural Language Processing, Reintegration Post-Deployment, Risk Factors, Surveillance
MeSH Terms: none