**Lead/Presenter:** Robert Lew, COIN - Bedford/Boston

**All Authors:** *Lew RA (HSR&D COIN - Bedford/Boston); Miller C (HSR&D COIN - Bedford/Boston); Kim B (HSR&D COIN - Bedford/Boston); Stolzmann K (HSR&D COIN - Bedford/Boston); Wu H (HSR&D COIN - Bedford/Boston); Bauer MS (HSR&D COIN - Bedford/Boston)*

**Objectives:** Stepped wedge designs stagger implementation over time. How site characteristics are grouped into categories greatly affects how well those characteristics can be balanced across time periods. In a mental health intervention study that balanced sites to reduce time trends, we varied the category definitions to perturb the balance.

**Methods:** The Behavioral Health Interdisciplinary Program trial balanced 9 VA Medical Center sites (3 sites assigned to each of 3 consecutive periods) on 10 site characteristics: AES Psychological Safety, AES Civility, Bedsize, %Rural Veterans, #Psychiatric Teams, #Patients, %Utilization of primary care, #High risk patients, #Phone Encounters, and Geographic Region. Initially, we recoded numerical factors into tertiles (large, medium, small), because exact values, such as the exact number of beds, could not be balanced. Imbalance is analogous to the moments of weights placed on a seesaw at various distances from the fulcrum. For Bedsize, with the middle period as the fulcrum, a category such as "large" balances exactly if periods 1 and 3 have the same number of "large" bedsize sites. For the category "large" bedsize, the imbalance score was the absolute difference between the moments right and left of the fulcrum, divided by the number of "large" sites. The sum over all categories was the overall imbalance score. The design for the sequence "ABCDEFGHI" assigns sites "ABC", "DEF", and "GHI" to periods 1, 2, and 3. Random permutations yielded 1680 distinct designs. For each variation of the category definitions, we generated the 100 best-balanced designs.
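The enumeration and scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the trial's program: the site labels follow the "ABCDEFGHI" example, but the `FACTORS` data, the function names, and the unit lever arm for periods 1 and 3 are assumptions made for the example. Choosing 3 of 9 sites for period 1 and 3 of the remaining 6 for period 2 gives 84 × 20 = 1680 designs, which matches the count reported.

```python
from itertools import combinations

SITES = "ABCDEFGHI"

def enumerate_designs(sites=SITES, per_period=3):
    """Yield every distinct (period 1, period 2, period 3) assignment.

    Order of sites within a period is ignored, so 9 sites in 3 periods
    of 3 give C(9,3) * C(6,3) = 1680 designs.
    """
    for p1 in combinations(sites, per_period):
        rest = [s for s in sites if s not in p1]
        for p2 in combinations(rest, per_period):
            p3 = tuple(s for s in rest if s not in p2)
            yield p1, p2, p3

def category_imbalance(design, members):
    """|moment right - moment left| / number of sites in the category.

    Assumption: period 2 is the fulcrum and periods 1 and 3 each sit
    one unit away, so each moment is simply a count of sites.
    """
    members = set(members)
    left = sum(1 for s in design[0] if s in members)
    right = sum(1 for s in design[2] if s in members)
    return abs(right - left) / len(members)

def overall_imbalance(design, factors):
    """Sum the category imbalance over every category of every factor."""
    return sum(
        category_imbalance(design, members)
        for categories in factors.values()
        for members in categories.values()
    )

# Hypothetical site data, invented for illustration only.
FACTORS = {
    "bedsize": {"large": "ABC", "medium": "DEF", "small": "GHI"},
    "region": {"north": "ADG", "south": "BEH", "west": "CFI"},
}

designs = list(enumerate_designs())
# Ties are broken by enumeration order because sorted() is stable.
ranked = sorted(designs, key=lambda d: overall_imbalance(d, FACTORS))
best = ranked[:100]
```

With this toy data, the sequence design "ABC", "DEF", "GHI" places every "large" site in period 1 and every "small" site in period 3, so it scores poorly on Bedsize, while designs that spread each category evenly across periods 1 and 3 rank first.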

**Results:** Initially, the mean overall imbalance was 0.99 for the 34 best designs versus 3.10 for all 1680 designs, a reduction of about two-thirds. This roughly held for each category. As we varied category definitions, new issues emerged: assigning factors different weights, deleting highly correlated factors, rescaling numerical factors, and transforming factors to reduce skewness.

**Implications:** Best-balanced designs varied greatly when features such as category definitions, factor weights, and scales changed. Given a strategy to limit the possible variations, a program can easily generate sets of the 100 best-balanced designs and identify the few designs least sensitive to such variations.
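One way to look for designs that are insensitive to category definitions is sketched below. The abstract does not say how sensitivity was measured, so this example assumes a simple stand-in: score each design under several alternative category definitions and keep the designs whose worst-case score is smallest. The `VARIANTS` data and all function names are hypothetical.

```python
from itertools import combinations

def enumerate_designs(sites="ABCDEFGHI", per_period=3):
    """Yield all 1680 assignments of 9 sites to 3 periods of 3 sites."""
    for p1 in combinations(sites, per_period):
        rest = [s for s in sites if s not in p1]
        for p2 in combinations(rest, per_period):
            yield p1, p2, tuple(s for s in rest if s not in p2)

def overall_imbalance(design, factors):
    """Sum over categories of |count in period 3 - count in period 1| / size."""
    first, _, last = design
    total = 0.0
    for categories in factors.values():
        for members in categories.values():
            left = sum(s in members for s in first)
            right = sum(s in members for s in last)
            total += abs(right - left) / len(members)
    return total

# Hypothetical alternative codings of one factor (invented data).
VARIANTS = [
    {"bedsize": {"large": "ABC", "medium": "DEF", "small": "GHI"}},  # tertiles
    {"bedsize": {"large": "ABCD", "small": "EFGHI"}},                # two groups
    {"bedsize": {"large": "AB", "medium": "CDEFG", "small": "HI"}},  # wide middle
]

def worst_case(design, variants=VARIANTS):
    """Largest overall imbalance of a design across all category definitions."""
    return max(overall_imbalance(design, factors) for factors in variants)

designs = list(enumerate_designs())
best_worst = min(worst_case(d) for d in designs)
# Designs that stay best balanced no matter which definition is used.
robust = [d for d in designs if worst_case(d) == best_worst]
```

Ranking by the worst case avoids arbitrary tie-breaking among the many designs that score equally under a single definition; counting how often a design appears among the 100 best across definitions would be another reasonable choice.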

**Impacts:** Robust balancing of site-level characteristics controls for time trends and for subjective bias in choosing categories.