HSR&D Citation Abstract
Helfrich CD, Blevins D, Kelly PA, Gylys-Colwell IM, Dubbert PM. Using different intra-class correlations to assess inter-rater reliability and inter-rater agreement: Example of organizational readiness to change from three implementation studies. Paper presented at: National Institutes of Health Conference on the Science of Dissemination and Implementation: Research At The Crossroads; 2012 Mar 19; Bethesda, MD.
There is widespread interest in measuring subjective, organizational factors, such as culture and readiness to change. However, survey data on organizational constructs are often combined without assessing inter-rater reliability (IRR). Poor IRR can indicate problems with construct validity, scale reliability, or both. IRR should be assessed with two different intra-class correlations (ICCs): ICC(1), the proportion of variance attributable to the organization, and ICC(2), the reliability of the mean score. We illustrate each ICC and its interpretation using findings from a broader validation study of a previously developed survey, the Organizational Readiness to Change Assessment (ORCA).
We combined data from three implementation studies in the Veterans Health Administration testing external facilitation interventions to increase use of three different clinical practices. In each implementation study, two scales from the ORCA were fielded (Evidence and Context) among multiple individuals involved in implementation. We used hierarchical modeling to calculate ICC(1), and derived the ICC(2) using the Spearman-Brown equation.
For the Evidence scale, we had a total of 95 observations from 42 sites, and for the Context scale 105 observations from 41 sites. ICC(1)s for the aggregated data were .32 for Evidence and .27 for Context (i.e., 32% and 27% of the variance in Evidence and Context scores, respectively, were attributable to the organization). ICC(2)s were .52 and .48 for Evidence and Context (i.e., 52% and 48% reliability, respectively, of the mean scores). To achieve an ICC(2) of >= .80, based on the observed ICC(1)s, we estimate a minimum of 11 observations per site would be needed.
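The step from ICC(1) to ICC(2), and the sample-size estimate above, both follow from the Spearman-Brown prophecy formula, ICC(2) = k·ICC(1) / (1 + (k−1)·ICC(1)), where k is the number of raters per site. A minimal sketch, using the average observations-per-site as k (a simplification, since actual site sizes varied):

```python
import math

def icc2_spearman_brown(icc1, k):
    """Spearman-Brown: reliability of the mean of k raters."""
    return k * icc1 / (1 + (k - 1) * icc1)

def min_raters(icc1, target):
    """Smallest k with ICC(2) >= target (inverted Spearman-Brown)."""
    return math.ceil(target * (1 - icc1) / (icc1 * (1 - target)))
```

Plugging in the reported figures: `icc2_spearman_brown(0.32, 95 / 42)` gives about .52 for Evidence, and `min_raters(0.27, 0.80)` returns 11, matching the stated minimum (driven by the lower Context ICC(1) of .27).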
ICC(1)s exceeded conventional thresholds (typically .08-.20), which supports the construct validity of the instrument as a measure of an organizational-level construct. However, ICC(2)s did not meet minimum thresholds (typically .70-.80), indicating we could not reliably estimate mean scores at the organizational level. Even with relatively high ICC(1)s, much larger numbers of respondents per site would be required to obtain reliable site-level measures. This simple analysis is widely applicable to other organizational measures.
This research was supported by the VA Health Services Research and Development Service, grant IIR 09-067.