3058 — Screening for Homelessness in the Free Text of VA Clinical Documents Using Natural Language Processing
Carter ME, Palmer M, Redd A, Ginter T, Pickard S, Shen S, South B, Nelson R, Samore MH, and Gundlapalli AV, VA Salt Lake City HCS, University of Utah
We aim to develop natural language processing (NLP) algorithms that use clinical narratives (unstructured data) to screen Veterans’ records for evidence of current or past homelessness. The objective is to support local and national policy, planning, and allocation of resources for Veterans experiencing or at risk of homelessness.
Cohort selection: From a corpus of 1.77 million VA clinical notes on Veterans seen in VHA facilities in 2009, we selected a cohort of notes containing ‘homeless’ in the note title and a control set of random notes. Creating the reference standard: Using a written guideline, human reviewers classified each note as ‘confirmed homelessness,’ ‘possible/at risk of homelessness,’ or ‘no evidence of homelessness.’ Inter-rater reliability (kappa) was calculated. Training the NLP screening tool: We used 2/3 of the reference standard corpus to train Automated Retrieval Console v2.0 (ARC) and develop an NLP model for detecting ‘homelessness’ in clinical notes. Structured elements such as ‘homeless’ in the note title, clinic stop codes, and ICD-9 codes for homelessness were also used to identify homelessness among Veterans.
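As an illustration of the inter-rater reliability step, the following is a minimal sketch of a weighted kappa for two reviewers using the three note categories. The ordering of the categories, the linear disagreement weights, and the function name `weighted_kappa` are assumptions made for this example; the abstract does not state which weighting scheme or software was used.

```python
from collections import Counter

# Assumed ordinal ordering of the three review categories (not stated in the abstract).
LABELS = ["no evidence", "possible/at risk", "confirmed"]


def weighted_kappa(rater_a, rater_b, labels=LABELS):
    """Weighted kappa with linear disagreement weights (an assumed scheme)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of notes")

    k = len(labels)
    n = len(rater_a)
    index = {label: i for i, label in enumerate(labels)}

    # Joint counts of (rater A category, rater B category) and each rater's marginals.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def weight(i, j):
        # 0 for agreement, 1 for the largest possible disagreement.
        return abs(i - j) / (k - 1)

    observed = sum(weight(i, j) * c for (i, j), c in joint.items()) / n
    expected = sum(
        weight(i, j) * marg_a[i] * marg_b[j] for i in range(k) for j in range(k)
    ) / (n * n)

    if expected == 0:
        # Both raters used one identical category throughout; kappa is undefined.
        raise ValueError("kappa is undefined when no disagreement is expected by chance")
    return 1.0 - observed / expected
```

With linear weights, a ‘confirmed’ versus ‘possible/at risk’ disagreement is penalized half as much as a ‘confirmed’ versus ‘no evidence’ disagreement, which is one reasonable way to treat the middle category.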
From a final cohort of 862 notes, human review found that 45% of the documents contained evidence of ‘confirmed homelessness,’ 4% ‘possible/at risk of homelessness,’ and 51% ‘no evidence’ (weighted kappa of 0.79). The most successful ARC model from the training set achieved 95.2% recall, 94.5% precision, and an F-measure of 94.8% when compared to the reference standard. Results from the test set of documents were 97.2% recall, 93.8% precision, and an F-measure of 95.5%. Structured elements identified 25% of the note corpus with ‘homeless’ in the note title, 12% with ICD-9 codes for homelessness, and 9% with a clinic stop code affiliated with homeless services (31% had at least one of these).
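The reported F-measures are consistent with the balanced F-measure (the harmonic mean of precision and recall). The short sketch below recomputes them from the reported precision and recall; the function name `f_measure` is chosen for this example, and the use of the balanced form is an assumption, since the abstract does not name the variant.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (same units as inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported, in percent.
training_f = f_measure(precision=94.5, recall=95.2)
test_f = f_measure(precision=93.8, recall=97.2)

print(f"training F-measure: {training_f:.1f}")
print(f"test F-measure: {test_f:.1f}")
```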
Our work shows that an NLP system can screen for evidence of homelessness in the text of a clinical document. We are working to develop appropriate use cases for these algorithms.
Homelessness is a high-priority area for VHA. Identifying homeless or at-risk Veterans could help VA target prevention efforts to those most in need and better understand homelessness among the Veteran population as a whole.