National Meeting 2007

3034 — Text Searches to Reduce Unknown Race in VistA Data

Halanych JH (Deep South Center of Effectiveness, Birmingham VAMC), Safford MM (Deep South Center of Effectiveness, Birmingham VAMC)

Objectives:
Recent changes in VA race/ethnicity reporting have brought the categories in line with the US Census categories. Unfortunately, the race/ethnicity fields are missing for 61.1% of patients in the national datasets. Programs can be written to search text fields for pre-specified strings of characters. We developed and tested a text search algorithm for VistA to determine whether we could reduce the number of patients with “unknown” race/ethnicity.

Methods:
As part of an existing diabetes study, we identified a cohort of patients who utilized VA services in FY00-05. We used a locally written MUMPS function to search the text fields of any patient encounter for 23 predefined character strings indicative of race/ethnicity: 12 for black patients (e.g., “YO AA”), 9 for white patients (e.g., “WM”), and 2 for Hispanic patients (e.g., “HISPAN”). The search algorithm was evaluated by structured chart review of: 1) 500 randomly selected records of patients with unknown race in the VistA race field; 2) all women and Hispanic patients; and 3) a randomly selected sample of 250 black and 250 white men. There were not enough American Indian and Asian patients in our VA for meaningful estimates.
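
The string search described above can be illustrated with a minimal Python sketch. It is not the authors' MUMPS function. It uses only the three example strings quoted in this abstract, since the full list of 23 is not given here. The word-boundary matching, the case-insensitive comparison, the rule that conflicting matches leave a patient unassigned, and the names `EXAMPLE_PATTERNS` and `assign_race` are all assumptions made for illustration.

```python
import re

# Only the three example strings quoted in the abstract are used here;
# the remaining 20 strings are not listed in the source.
# Assumption: "YO AA" and "WM" are matched as whole tokens, while
# "HISPAN" is matched as a word prefix (e.g., "HISPANIC").
EXAMPLE_PATTERNS = {
    "black": [r"\bYO AA\b"],
    "white": [r"\bWM\b"],
    "hispanic": [r"\bHISPAN"],
}

# Assumption: matching ignores case.
COMPILED = {
    category: [re.compile(p, re.IGNORECASE) for p in patterns]
    for category, patterns in EXAMPLE_PATTERNS.items()
}


def assign_race(note_texts):
    """Return the one category matched across all notes, else None.

    Assumption: a patient whose notes match more than one category is
    left unassigned rather than resolved by a tie-breaking rule.
    """
    matched = set()
    for text in note_texts:
        for category, patterns in COMPILED.items():
            if any(p.search(text) for p in patterns):
                matched.add(category)
    return matched.pop() if len(matched) == 1 else None


if __name__ == "__main__":
    # Hypothetical note text, not taken from any patient record.
    print(assign_race(["58 YO AA male with DM2"]))
```

Leaving conflicting matches unassigned trades coverage for accuracy; a production version would need its own rule for such cases and the same kind of chart-review validation described here.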

Results:
Forty-six percent (6,345) of the 13,723 patients with diabetes had missing race in local VistA data. The text search algorithm assigned a race category to 68% of patients with unknown race, and 99.4% of these assignments were determined to be accurate by chart review. For those with known race, the text search algorithm demonstrated 85.8% overall agreement with the VistA race field. Agreement was higher in white and black participants (100% and 88.1%, respectively) than in Hispanic participants (45.5%).
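
As a rough back-of-the-envelope check, the figures above imply how far the unknown-race share of the cohort could fall. The sketch below uses only the counts reported here. Because the 68% figure is rounded, the derived counts are approximate and are not results reported by the study. The variable names are illustrative.

```python
# Figures reported in the Results section.
cohort = 13723          # patients with diabetes
unknown = 6345          # patients with missing race in local VistA data
assigned_share = 0.68   # rounded share of unknowns assigned a category

# Derived, approximate values (not reported in the abstract).
assigned = round(unknown * assigned_share)
still_unknown = unknown - assigned

print(f"Unknown before text search: {unknown / cohort:.1%}")
print(f"Approximate unknown after:  {still_unknown / cohort:.1%}")
```

Under these assumptions, the share of the cohort with unknown race would drop from about 46% to roughly 15%.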

Implications:
A text search algorithm can be used to identify race in local VA VistA data. Low numbers of Hispanic, Native American, and Asian patients at our VA prevent reliable assessment for those racial/ethnic groups.

Impacts:
The high proportion of patients with missing data in the VA race fields hampers efforts to identify and reduce racial/ethnic disparities. We offer a system that can work at the local level and, if verified and expanded, could be implemented centrally in Austin to complete the VHA Medical SAS Datasets.