misty (version 0.6.7)

center: Centering Predictor Variables in Single-Level and Multilevel Data

Description

This function centers predictor variables in single-level data, two-level data, and three-level data at the grand mean (CGM, i.e., grand mean centering) or within cluster (CWC, i.e., group mean centering).

Usage

center(..., data = NULL, cluster = NULL, type = c("CGM", "CWC"),
       cwc.mean = c("L2", "L3"), value = NULL, append = TRUE, name = ".c",
       as.na = NULL, check = TRUE)

Value

Returns a numeric vector or data frame with the same length or same number of rows as ... containing the centered variable(s).

Arguments

...

a numeric vector for centering a predictor variable, or a data frame for centering more than one predictor. Alternatively, an expression indicating the variable names in data, e.g., center(x1, x2, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.

data

a data frame when specifying one or more predictor variables in the argument .... Note that the argument is NULL when specifying a numeric vector or data frame for the argument ....

cluster

a character string indicating the name of the cluster variable in ... or data for two-level data, a character vector indicating the names of the cluster variables in ... for three-level data, or a vector or data frame representing the nested grouping structure (i.e., group or cluster variables). Alternatively, a character string or character vector indicating the variable name(s) of the cluster variable(s) in data. Note that the cluster variable at Level 3 comes first in a three-level model, i.e., cluster = c("level3", "level2").

type

a character string indicating the type of centering, i.e., "CGM" for centering at the grand mean (i.e., grand mean centering, default when cluster = NULL) or "CWC" for centering within cluster (i.e., group mean centering, default when specifying the argument cluster).

cwc.mean

a character string indicating the type of centering of a level-1 predictor variable in a three-level model, i.e., "L2" (default) for centering the predictor variable at the level-2 cluster means, and "L3" for centering the predictor variable at the level-3 cluster means.

value

a numeric value for centering on a specific user-defined value. Note that this option is only available when specifying a single-level predictor variable, i.e., cluster = NULL.

append

logical: if TRUE (default), centered variable(s) are appended to the data frame specified in the argument data.

name

a character string or character vector indicating the names of the centered predictor variables. By default, centered predictor variables are named with the ending ".c" resulting in e.g. "x1.c" and "x2.c". Variable names can also be specified by using a character vector matching the number of variables specified in ... (e.g., name = c("center.x1", "center.x2")).

as.na

a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to ... but not to cluster.

check

logical: if TRUE (default), argument specification is checked.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

Single-Level Data

Predictor variables in single-level data can only be centered at the grand mean (CGM) by specifying type = "CGM":

$$x_{i} - \bar{x}_{.}$$

where \(x_{i}\) is the predictor value of observation \(i\) and \(\bar{x}_{.}\) is the average \(x\) score. Note that predictor variables can be centered on any meaningful value by specifying the argument value, e.g., a predictor variable is centered at 5 by applying the following formula:

$$x_{i} - \bar{x}_{.} + 5$$

resulting in a mean of the centered predictor variable of 5.
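
The arithmetic of the two formulas above can be checked by hand in base R. This is a minimal sketch using the built-in mtcars data; the object names x, x.cgm, and x.shift are chosen for illustration, and the code mirrors the formulas rather than the internal implementation of center().

```r
# Grand mean centering of 'disp' by hand (base R only, illustration)
x <- mtcars$disp
x.cgm <- x - mean(x, na.rm = TRUE)

# The centered predictor has a mean of (numerically) zero
mean(x.cgm)

# Shifting the centered predictor by 5, as in the second formula,
# gives a predictor with a mean of 5
x.shift <- x - mean(x, na.rm = TRUE) + 5
mean(x.shift)
```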

Two-Level Data

Level-1 (L1) predictor variables in two-level data can be centered at the grand mean (CGM) by specifying type = "CGM":

$$x_{ij} - \bar{x}_{..}$$

where \(x_{ij}\) is the predictor value of observation \(i\) in L2 cluster \(j\) and \(\bar{x}_{..}\) is the average \(x\) score.

L1 predictor variables are centered at the group mean (CWC) by specifying type = "CWC" (Default):

$$x_{ij} - \bar{x}_{.j}$$

where \(\bar{x}_{.j}\) is the average \(x\) score in cluster \(j\).

Level-2 (L2) predictor variables in two-level data can only be centered at the grand mean:

$$x_{.j} - \bar{x}_{..}$$

where \(x_{.j}\) is the predictor value of Level-2 cluster \(j\) and \(\bar{x}_{..}\) is the average Level-2 cluster score. Note that the cluster membership variable needs to be specified when centering an L2 predictor variable in two-level data. Otherwise, the average individual score \(x_{ij}\) instead of the average cluster score \(x_{.j}\) is used to center the predictor variable.
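
The two-level formulas can be sketched in base R as follows. The toy data frame dat and its columns are hypothetical, and ave() and tapply() are used here only to illustrate the arithmetic; this is not necessarily how center() computes the result internally. The last two lines show why the cluster variable matters for an L2 predictor when cluster sizes are unequal.

```r
# Hypothetical two-level toy data: 3 clusters of unequal size
dat <- data.frame(cluster = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                  x = c(2, 4, 6, 8, 1, 3, 5, 7, 9),           # L1 predictor
                  w = c(10, 10, 10, 10, 20, 20, 60, 60, 60))  # L2 predictor

# CGM for the L1 predictor: subtract the overall mean
x.cgm <- dat$x - mean(dat$x)

# CWC for the L1 predictor: subtract the mean of the respective cluster
x.cwc <- dat$x - ave(dat$x, dat$cluster)

# CGM for the L2 predictor: subtract the mean of the cluster scores,
# i.e., one value per cluster, not the mean of the row-wise values
w.cluster <- tapply(dat$w, dat$cluster, function(z) z[1])
w.cgm <- dat$w - mean(w.cluster)

# Mean of the cluster scores vs. mean of the individual scores
mean(w.cluster)  # (10 + 20 + 60) / 3
mean(dat$w)      # (4 * 10 + 2 * 20 + 3 * 60) / 9
```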

Three-Level Data

Level-1 (L1) predictor variables in three-level data can be centered at the grand mean (CGM) by specifying type = "CGM":

$$x_{ijk} - \bar{x}_{...}$$

where \(x_{ijk}\) is the predictor value of observation \(i\) in Level-2 cluster \(j\) within Level-3 cluster \(k\) and \(\bar{x}_{...}\) is the average \(x\) score.

L1 predictor variables are centered within cluster (CWC) by specifying type = "CWC" (Default). However, L1 predictor variables can be either centered within Level-2 cluster (cwc.mean = "L2", Default, see Brincks et al., 2017):

$$x_{ijk} - \bar{x}_{.jk}$$

or within Level-3 cluster (cwc.mean = "L3", see Enders, 2013):

$$x_{ijk} - \bar{x}_{..k}$$

where \(\bar{x}_{.jk}\) is the average \(x\) score in Level-2 cluster \(j\) within Level-3 cluster \(k\) and \(\bar{x}_{..k}\) is the average \(x\) score in Level-3 cluster \(k\).
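
The difference between the two CWC variants for an L1 predictor can be seen in a small base R sketch. The toy data frame dat3 and its columns l3, l2, and x are hypothetical, and ave() is used only to illustrate the formulas, not to reproduce the internals of center().

```r
# Hypothetical three-level toy data: L2 clusters nested in L3 clusters
dat3 <- data.frame(l3 = c(1, 1, 1, 1, 2, 2, 2, 2),
                   l2 = c(1, 1, 2, 2, 3, 3, 4, 4),
                   x  = c(1, 3, 5, 7, 2, 4, 10, 12))

# Analogue of cwc.mean = "L2": subtract the Level-2 cluster mean
x.cwc.l2 <- dat3$x - ave(dat3$x, dat3$l3, dat3$l2)

# Analogue of cwc.mean = "L3": subtract the Level-3 cluster mean
x.cwc.l3 <- dat3$x - ave(dat3$x, dat3$l3)

# Compare both versions side by side
cbind(dat3, x.cwc.l2, x.cwc.l3)
```

In this sketch, x.cwc.l2 removes all between-cluster differences at Level 2 and Level 3, whereas x.cwc.l3 still contains the differences between Level-2 clusters within the same Level-3 cluster.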

Level-2 (L2) predictor variables in three-level data can be centered at the grand mean (CGM) by specifying type = "CGM":

$$x_{.jk} - \bar{x}_{...}$$

where \(x_{.jk}\) is the predictor value of Level-2 cluster \(j\) within Level-3 cluster \(k\) and \(\bar{x}_{...}\) is the average Level-2 cluster score.

L2 predictor variables are centered within cluster (CWC) by specifying type = "CWC" (Default):

$$x_{.jk} - \bar{x}_{..k}$$

where \(\bar{x}_{..k}\) is the average \(x\) score in Level-3 cluster \(k\).

Level-3 (L3) predictor variables in three-level data can only be centered at the grand mean:

$$x_{..k} - \bar{x}_{...}$$

where \(x_{..k}\) is the predictor value of Level-3 cluster \(k\) and \(\bar{x}_{...}\) is the average Level-3 cluster score. Note that the cluster membership variable needs to be specified when centering an L3 predictor variable in three-level data.
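
Grand mean centering of higher-level predictors uses the mean of the cluster scores at the respective level, which differs from the mean over all rows when cluster sizes are unequal. The following base R sketch illustrates this with a hypothetical toy data frame dat3u (columns l3, l2, w2, and w3 are made up for illustration); it mirrors the formulas and is not the package's implementation.

```r
# Hypothetical toy data: 2 L3 clusters, 4 L2 clusters, unequal cluster sizes
dat3u <- data.frame(l3 = c(1, 1, 1, 2, 2, 2, 2),
                    l2 = c(1, 1, 2, 3, 4, 4, 4),
                    w2 = c(4, 4, 8, 1, 5, 5, 5),          # L2 predictor
                    w3 = c(10, 10, 10, 30, 30, 30, 30))   # L3 predictor

# Keep one row per L2 cluster and one row per L3 cluster
l2.dat <- dat3u[!duplicated(dat3u$l2), ]
l3.dat <- dat3u[!duplicated(dat3u$l3), ]

# L2 predictor, CGM: subtract the mean of the L2 cluster scores
w2.cgm <- dat3u$w2 - mean(l2.dat$w2)

# L3 predictor, CGM: subtract the mean of the L3 cluster scores
w3.cgm <- dat3u$w3 - mean(l3.dat$w3)

# Mean of the L3 cluster scores vs. mean over all rows
mean(l3.dat$w3)  # (10 + 30) / 2
mean(dat3u$w3)   # (3 * 10 + 4 * 30) / 7
```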

References

Brincks, A. M., Enders, C. K., Llabre, M. M., Bulotsky-Shearer, R. J., Prado, G., & Feaster, D. J. (2017). Centering predictor variables in three-level contextual models. Multivariate Behavioral Research, 52(2), 149–163. https://doi.org/10.1080/00273171.2016.1256753

Chang, C.-N., & Kwok, O.-M. (2022). Partitioning variance for a within-level predictor in multilevel models. Structural Equation Modeling: A Multidisciplinary Journal. Advance online publication. https://doi.org/10.1080/10705511.2022.2051175

Enders, C. K. (2013). Centering predictors and contextual effects. In M. A. Scott, J. S. Simonoff, & B. D. Marx (Eds.), The Sage handbook of multilevel modeling (pp. 89–109). Sage. https://dx.doi.org/10.4135/9781446247600

Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12, 121–138. https://doi.org/10.1037/1082-989X.12.2.121

Rights, J. D., Preacher, K. J., & Cole, D. A. (2020). The danger of conflating level-specific effects of control variables when primary interest lies in level-2 effects. British Journal of Mathematical & Statistical Psychology, 73, 194–211. https://doi.org/10.1111/bmsp.12194

Yaremych, H. E., Preacher, K. J., & Hedeker, D. (2021). Centering categorical predictors in multilevel models: Best practices and interpretation. Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000434

See Also

coding, cluster.scores, rec, item.reverse, rwg.lindell, item.scores.

Examples

#----------------------------------------------------------------------------
# Predictor Variables in Single-Level Data

# Example 1a: Center predictor 'disp' at the grand mean
center(mtcars$disp)

# Example 1b: Alternative specification using the 'data' argument
center(disp, data = mtcars)

# Example 2a: Center predictors 'disp' and 'hp' at the grand mean and append to 'mtcars'
cbind(mtcars, center(mtcars[, c("disp", "hp")]))

# Example 2b: Alternative specification using the 'data' argument
center(disp, hp, data = mtcars)

# Example 3: Center predictor 'disp' at the value 3
center(disp, data = mtcars, value = 3)

# Example 4: Center predictors 'disp' and 'hp' and label with the suffix ".v"
center(disp, hp, data = mtcars, name = ".v")

#----------------------------------------------------------------------------
# Predictor Variables in Two-Level Data

# Load data set "Demo.twolevel" in the lavaan package
data("Demo.twolevel", package = "lavaan")

# Example 5a: Center L1 predictor 'y1' within cluster
center(Demo.twolevel$y1, cluster = Demo.twolevel$cluster)

# Example 5b: Alternative specification using the 'data' argument
center(y1, data = Demo.twolevel, cluster = "cluster")

# Example 6a: Center L2 predictor 'w1' at the grand mean
center(w1, data = Demo.twolevel, cluster = "cluster")

# Example 6b: Center L1 predictor 'y1' within cluster and L2 predictor 'w1' at the grand mean
center(y1, w1, data = Demo.twolevel, cluster = "cluster")

#----------------------------------------------------------------------------
# Predictor Variables in Three-Level Data

# Create arbitrary three-level data
Demo.threelevel <- data.frame(Demo.twolevel, cluster2 = Demo.twolevel$cluster,
                                             cluster3 = rep(1:10, each = 250))

# Example 7a: Center L1 predictor 'y1' within L2 cluster
center(y1, data = Demo.threelevel, cluster = c("cluster3", "cluster2"))

# Example 7b: Center L1 predictor 'y1' within L3 cluster
center(y1, data = Demo.threelevel, cluster = c("cluster3", "cluster2"), cwc.mean = "L3")

# Example 7c: Center L1 predictor 'y1' within L2 cluster and L2 predictor 'w1' within L3 cluster
center(y1, w1, data = Demo.threelevel, cluster = c("cluster3", "cluster2"))
