Return to 2001 Abstracts List
*290. The Unreliability of Single-Item Scales: Implications for Clinical and Health Services Research
K Wristers, Houston Center for Quality of Care and Utilization Studies; L Rabeneck, Houston Center for Quality of Care and Utilization Studies; KF Cook, VA Rehabilitation R&D Center; J Souchek, Houston Center for Quality of Care and Utilization Studies; TJ Menke, Houston Center for Quality of Care and Utilization Studies; NP Wray, Houston Center for Quality of Care and Utilization Studies
Objectives: Single-item self-report scales are routinely used in clinical and health services research despite researchers' own acknowledgment that they are unreliable. Researchers have recently reported using single-item scales as primary outcome measures, secondary outcome measures, classification tools, validation tools, and covariates. Unreliability attenuates estimated treatment effects, reduces correlations among variables, and lowers classification accuracy. This study compared the reliability of a single-item self-report dyspepsia pain scale with that of a multi-item dyspepsia pain scale. In addition, this study investigated how a scale's reliability affects outcome assessment (classification of treatment success or failure).
Methods: We analyzed data from a double-blind, randomized clinical trial of 128 patients with dyspepsia (upper abdominal discomfort or pain thought to arise in the upper gut). We first evaluated whether the items from the single-item and multi-item scales measured a unidimensional construct, pain. We then estimated the reliability of the single-item dyspepsia pain scale and the 6-item SODA (Severity of Dyspepsia Assessment) Pain Intensity Scale using the communality estimates from a factor analysis of the items. The reliability estimates were then used to simulate data to evaluate the effect of scale reliability on classifying treatment success or failure.
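The abstract does not report the factor loadings, but the logic of communality-based reliability estimation can be sketched as follows. Under a one-factor model, a single item's reliability equals its communality (squared standardized loading), and a multi-item composite's reliability can be estimated with coefficient omega. The loadings below are hypothetical, chosen only so that the results approximate the reported values of .48 and .95.

```python
# Hypothetical standardized loadings from a one-factor model of the pain items.
# These are illustrative assumptions, not the study's actual estimates.
single_loading = 0.69                                  # single-item pain scale
soda_loadings = [0.85, 0.88, 0.83, 0.90, 0.86, 0.84]   # 6 SODA Pain Intensity items

# Single-item reliability = the item's communality (squared loading).
rel_single = single_loading ** 2

# Composite reliability via coefficient omega:
# omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of uniquenesses),
# where each item's uniqueness is 1 minus its communality.
loading_sum_sq = sum(soda_loadings) ** 2
uniqueness_sum = sum(1 - l ** 2 for l in soda_loadings)
rel_multi = loading_sum_sq / (loading_sum_sq + uniqueness_sum)

print(f"single-item reliability ~ {rel_single:.2f}")
print(f"multi-item reliability  ~ {rel_multi:.2f}")
```

With these assumed loadings the single item's reliability is about .48 and the 6-item composite's about .94, illustrating how aggregating several moderately loading items yields a far more reliable score than any one item alone.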
Results: The factor analyses provided evidence that the items measured a unidimensional construct. The reliability estimates for the single-item and multi-item pain scales were .48 and .95, respectively. We then compared the simulated scores of one patient to a predefined cut-score, a score indicating a need for treatment. The comparison indicated that the probability of making incorrect treatment decisions was 30% greater with the single-item scale than with the multi-item scale.
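The cut-score comparison can be illustrated with a minimal simulation. Assuming standardized true scores (SD = 1), a scale with reliability r has a standard error of measurement of sqrt(1 - r), so observed scores scatter around the true score more widely when reliability is low. The specific true score and cut-score below are hypothetical, not values from the trial.

```python
import math
import random

def misclassification_rate(true_score, cut, reliability, n=200_000, seed=1):
    """Fraction of simulated observed scores falling on the wrong side of
    the cut-score. Observed = true + error, where the error SD is the
    standard error of measurement, sqrt(1 - reliability), for a scale
    with unit true-score variance."""
    rng = random.Random(seed)
    sem = math.sqrt(1.0 - reliability)
    truth = true_score >= cut
    wrong = sum(1 for _ in range(n)
                if ((true_score + rng.gauss(0.0, sem)) >= cut) != truth)
    return wrong / n

# One hypothetical patient whose true score sits 0.3 SD above the cut.
p_single = misclassification_rate(0.3, 0.0, 0.48)  # single-item scale
p_multi = misclassification_rate(0.3, 0.0, 0.95)   # 6-item SODA scale
print(f"P(wrong decision), single-item: {p_single:.2f}")
print(f"P(wrong decision), multi-item:  {p_multi:.2f}")
```

Under these assumptions the single-item scale misclassifies the patient roughly a third of the time while the multi-item scale misclassifies about a tenth of the time, consistent in direction with the abstract's finding that incorrect treatment decisions were substantially more likely with the single-item scale.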
Conclusions: Single-item scales are not reliable tools for clinicians and health services researchers. The dyspepsia example and simulated data demonstrated how the poor reliability of a single-item scale can produce substantial errors in scores and how these errors lead to incorrect clinical judgments.
Impact: Clinicians and health services researchers should avoid using single-item scales. Even though single-item scales are easy to use and time-efficient, their unreliability makes them poor outcome measures. Instead, researchers should use multi-item scales, which provide greater reliability and broader coverage of the construct being measured.