
2012 HSR&D/QUERI National Conference Abstract


2012 National Meeting

3058 — Screening for Homelessness in the Free Text of VA Clinical Documents Using Natural Language Processing

Carter ME, Palmer M, Redd A, Ginter T, Pickard S, Shen S, South B, Nelson R, Samore MH, and Gundlapallie AV, VA Salt Lake City HCS, University of Utah

We aim to develop natural language processing (NLP) algorithms that use clinical narratives (unstructured data) to screen Veterans’ records for evidence of current or past homelessness. The objective is to support local and national policy, planning, and allocation of resources for Veterans experiencing or at risk of homelessness.

Cohort selection: From a corpus of 1.77 million VA clinical notes on Veterans seen in VHA facilities in 2009, we selected a cohort of notes containing ‘homeless’ in the note title and a control set of random notes. Creating the reference standard: Using a written guideline, human reviewers classified each note as ‘confirmed homelessness,’ ‘possible/at risk of homelessness,’ or ‘no evidence of homelessness.’ Inter-rater reliability (kappa) was calculated. Training the NLP screening tool: Two-thirds of the reference standard corpus was used to train Automated Retrieval Console v2.0 (ARC), producing an NLP model for detecting ‘homelessness’ in clinical notes. Structured elements such as ‘homeless’ in the note title, clinic stop codes, and ICD-9 codes for homelessness were also used to identify homelessness among Veterans.
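The structured-element screen described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Note` record and its field names are hypothetical, the stop code value is a placeholder, and the use of ICD-9 code V60.0 (lack of housing) is an assumption, since the abstract does not list the specific codes it used.

```python
from dataclasses import dataclass, field

# Assumption: V60.0 ("lack of housing") stands in for the ICD-9 codes for
# homelessness; the abstract does not say which codes were used.
HOMELESS_ICD9 = {"V60.0"}
# Hypothetical placeholder; the real clinic stop codes are not given.
HOMELESS_STOP_CODES = {"HYPOTHETICAL_STOP_CODE"}


@dataclass
class Note:
    """Hypothetical record holding the structured fields of one clinical note."""
    title: str
    icd9_codes: set = field(default_factory=set)
    stop_code: str = ""


def structured_flags(note):
    """Return which structured elements suggest homelessness for this note."""
    return {
        "title": "homeless" in note.title.lower(),
        "icd9": bool(note.icd9_codes & HOMELESS_ICD9),
        "stop_code": note.stop_code in HOMELESS_STOP_CODES,
    }


def any_structured_evidence(note):
    """True if at least one structured element is present."""
    return any(structured_flags(note).values())
```

Keeping the three flags separate makes it possible to report how often each element occurs on its own as well as how often at least one is present.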

From a final cohort of 862 notes, human review found that 45% of the documents contained evidence of ‘confirmed homelessness,’ 4% ‘possible/at risk of homelessness,’ and 51% ‘no evidence’ (weighted kappa of 0.79). The most successful ARC model from the training set achieved 95.2% recall, 94.5% precision, and an F-measure of 94.8 when compared to the reference standard. Results from the test set of documents were 97.2% recall, 93.8% precision, and an F-measure of 95.5. Among structured elements, 25% of the note corpus had ‘homeless’ in the note title, 12% had ICD-9 codes for homelessness, and 9% had a clinic stop code affiliated with homeless services (31% had at least one of these).
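As a check on how the reported figures relate, the sketch below recomputes the F-measure from the stated recall and precision. It assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name is illustrative.

```python
def f_measure(recall, precision):
    """Balanced F-measure (F1): harmonic mean of recall and precision.

    Assumption: the abstract's F-measure weights recall and precision equally.
    """
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recall and precision as reported in the abstract, expressed as fractions.
training_f = f_measure(recall=0.952, precision=0.945)
test_f = f_measure(recall=0.972, precision=0.938)

print(round(training_f * 100, 1))  # 94.8
print(round(test_f * 100, 1))      # 95.5
```

Under that assumption, both recomputed values match the F-measures reported for the training and test sets.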

Our work shows the ability of an NLP system to screen for evidence of homelessness in the text of a clinical document. We are working to develop appropriate use cases for these algorithms.

Homelessness is a high-priority area for VHA. Identifying homeless or at-risk Veterans has the potential to help VA target prevention efforts to those most in need and to better understand homelessness among the Veteran population as a whole.
