1064 — Effect of Baseline Timeframe on Potential Bias in Risk Factor Assessment in an EHR-Based Cohort
Vassy JL, VA Boston Healthcare System; Ho Y, VA Boston Healthcare System; Gagnon DR, VA Boston Healthcare System; Gaziano JM, VA Boston Healthcare System; Honerlaw J, VA Boston Healthcare System; Raju S, VA Boston Healthcare System; Wilson PW, VA Boston Healthcare System; Cho K, VA Boston Healthcare System
Prospective cohort studies in classical epidemiology begin with a baseline exam when risk factors are measured, after which participants are observed over time for health outcomes. Longitudinal patient data from electronic health records (EHR) hold potential for large-scale health services research. However, they may necessitate flexibility in defining the timeframe of baseline risk factor assessment, since patients in a health system access care at varying frequencies. This variable contact may introduce information or selection bias to any study using such an EHR-defined cohort. In creating a cardiovascular disease (CVD) cohort study using VA patient data, we tested the hypothesis that widening the timeframe used to measure baseline CVD risk factors increases the yield of eligible patient data without introducing bias.
To define a Veteran cohort study, we used data from VISN 1 and VISN 7. We identified 589,361 eligible patients, defined as those having valid demographic data and ≥1 set of blood lipid results between 2000 and 2007. We anchored the index date to the date of the first eligible lipid results. We then expanded the definition of the baseline timeframe by 1-week intervals before or after this date, assessing the proportion of eligible patients with blood pressure (BP) measurements at each successive widening of the baseline timeframe. We compared 3 mutually exclusive groups of patients: 1) those with BP from the exact index date, 2) those with BP not on the index date but within the VISN-specific 90th percentile to either side of the index date, and 3) those with no BP within the VISN-specific 90th percentile. We identified baseline CVD, diabetes, and mental health conditions from ICD-9 codes.
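The window-widening and grouping steps described above can be illustrated with a minimal Python sketch. It is not the study's actual code: the function names, the toy data, and the choice to widen the window independently before and after the index date are assumptions made for illustration only.

```python
from datetime import date

def nearest_bp_offset(index_date, bp_dates):
    """Signed day offset of the BP measurement closest to the index date.

    Returns None when the patient has no BP measurement at all.
    """
    if not bp_dates:
        return None
    return min(((d - index_date).days for d in bp_dates), key=abs)

def coverage_by_window(offsets, weeks_before, weeks_after):
    """Share of patients with a BP inside a window widened in 1-week steps."""
    lo, hi = -7 * weeks_before, 7 * weeks_after
    hits = sum(1 for o in offsets if o is not None and lo <= o <= hi)
    return hits / len(offsets)

def assign_group(offset, days_before, days_after):
    """Assign one of the 3 mutually exclusive groups.

    days_before / days_after stand in for the VISN-specific window bounds
    (hypothetical parameters; the abstract derives them from the 90th percentile).
    """
    if offset == 0:
        return 1  # BP measured on the exact index date
    if offset is not None and -days_before <= offset <= days_after:
        return 2  # BP within the window but not on the index date
    return 3      # no BP within the window

# Toy data (invented): index date plus the dates of any BP measurements.
patients = {
    "a": (date(2003, 1, 10), [date(2003, 1, 10)]),
    "b": (date(2003, 1, 10), [date(2003, 1, 15)]),
    "c": (date(2003, 1, 10), [date(2002, 12, 20)]),
    "d": (date(2003, 1, 10), []),
}
offsets = [nearest_bp_offset(idx, bps) for idx, bps in patients.values()]
groups = [assign_group(o, days_before=14, days_after=7) for o in offsets]
```

Calling `coverage_by_window(offsets, w, w)` for increasing `w` traces how the proportion of patients with a usable baseline BP grows as the timeframe widens.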
Group 1 contained 146,636 (61.0%) and 289,906 (83.1%) of the eligible patients in VISN 1 and 7, respectively. The proportion of patients with a BP measurement reached 90% within +91 or -154 days of the index date in VISN 1 and within only +7 or -14 days in VISN 7. Within each VISN, the 3 groups did not differ substantially in BP or LDL-C levels, but Group 3 less often had race data available and had a lower prevalence of baseline comorbidities and fewer outpatient contacts with the health system.
Creating a prospective CVD cohort from VA data is feasible but may require special handling of patients with infrequent visits to minimize information and selection bias. The potential for bias may vary considerably across different VISNs.
In using longitudinal VA data for health services research, investigators should examine the potential for bias in their analyses introduced by variable contact with VA care.