stdEff will calculate fully standardised effects (coefficients) in standard deviation units for a fitted model or list of models. It achieves this by adjusting the 'raw' model coefficients, so no standardisation of input variables is required beforehand. Users can simply specify the model with all variables in their original units and the function will do the rest. However, the user is free to scale and/or centre any input variables should they choose, which should not affect the outcome of standardisation (provided any scaling is by standard deviations). This may be desirable in some cases, for example to increase numerical stability during model fitting when variables are on widely different scales.
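As a rough illustration (a sketch using the built-in mtcars data; this and the later sketches assume the semEff package is attached), standardised effects from a model fit on the original scales should be effectively identical to those from a model fit on pre-scaled predictors:

library(semEff)

## model with predictors in their original units
m1 <- lm(mpg ~ wt + hp, data = mtcars)
stdEff(m1)

## the same model with predictors scaled by their standard deviations
## beforehand - standardised effects should be effectively the same
d <- mtcars
d[c("wt", "hp")] <- scale(d[c("wt", "hp")])
m2 <- lm(mpg ~ wt + hp, data = d)
stdEff(m2)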
If arguments cen.x or cen.y are TRUE, effects will be calculated as if all predictors (x) and/or the response variable (y) were mean-centred prior to model fitting (including any dummy variables arising from categorical predictors). Thus, for an ordinary linear model where centring of both x and y is specified, the intercept will be zero, the mean (or weighted mean) of y having been subtracted. In addition, if cen.x = TRUE and there are interacting terms in the model, all effects for lower order terms of the interaction are adjusted using an expression which ensures that each main effect or lower order term is estimated at the mean values of the terms they interact with (zero in a 'centred' model), typically improving the interpretation of effects. The expression used comprises a weighted sum of all the effects that contain the lower order term, with the weight for the term itself being zero and those for 'containing' terms being the product of the means of the other variables involved in that term (i.e. those not in the lower order term itself). For example, for a three-way interaction (x1 * x2 * x3), the expression for the main effect \(\beta_1\) would be:
$$\beta_{1} + \beta_{12} \bar{x}_{2} + \beta_{13} \bar{x}_{3} + \beta_{123} \bar{x}_{2} \bar{x}_{3}$$
(adapted from here)
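This adjustment can be reproduced by hand from the 'raw' coefficients. The following base R sketch (using mtcars purely for illustration) applies the expression above and compares it with the coefficient from an equivalent model fit with mean-centred predictors:

## three-way interaction on the original scales
m <- lm(mpg ~ wt * hp * disp, data = mtcars)
b <- coef(m)

## adjusted main effect of wt, following the expression above:
## beta_1 + beta_12 * mean(x2) + beta_13 * mean(x3) +
##   beta_123 * mean(x2) * mean(x3)
x2 <- mean(mtcars$hp)
x3 <- mean(mtcars$disp)
b["wt"] + b["wt:hp"] * x2 + b["wt:disp"] * x3 + b["wt:hp:disp"] * x2 * x3

## this matches the 'wt' coefficient from the model with mean-centred predictors
d <- mtcars
d[c("wt", "hp", "disp")] <- scale(d[c("wt", "hp", "disp")], scale = FALSE)
coef(lm(mpg ~ wt * hp * disp, data = d))["wt"]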
In addition, if std.x = TRUE or unique.x = TRUE (see below), product terms for interactive effects will be recalculated using mean-centred variables, to ensure that standard deviations and variance inflation factors (VIFs) for predictors are calculated correctly (the model must be re-fit for this latter purpose, to recalculate the variance-covariance matrix).
If std.x = TRUE, effects are scaled by multiplying by the standard deviations of predictor variables (or terms), while if std.y = TRUE they are divided by the standard deviation of the response variable (minus any offsets). If the model is a GLM, the latter is calculated using the link-transformed response (or an estimate of it), generated using the function glt. If both arguments are TRUE, the effects are regarded as 'fully' standardised in the traditional sense, often referred to as 'betas'.
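For an ordinary linear model, these 'betas' can also be obtained by hand, which may help to clarify the scaling involved. A minimal sketch, with the multicollinearity adjustment (see below) switched off for comparability:

m <- lm(mpg ~ wt + hp, data = mtcars)

## 'traditional' betas: coefficients * sd(x) / sd(y)
coef(m)[-1] * sapply(mtcars[c("wt", "hp")], sd) / sd(mtcars$mpg)

## should match the predictor effects from stdEff() with the VIF
## adjustment disabled
stdEff(m, std.x = TRUE, std.y = TRUE, unique.x = FALSE)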
If unique.x = TRUE (the default), effects are adjusted for multicollinearity among predictors by dividing by the square root of the VIFs (Dudgeon 2016, Thompson et al. 2017). If they have also been scaled by the standard deviations of x and y, this converts them to semipartial correlations, i.e. the correlation between the unique components of predictors (residualised on other predictors) and the response variable. This measure of effect size is arguably much more interpretable and useful than the traditional standardised coefficient, as it is always estimated independently of other predictors and so can more readily be compared both within and across models. Values range from zero to +/- one rather than +/- infinity (as in the case of betas), putting them on the same scale as the bivariate correlation between predictor and response. In the case of GLMs, however, the measure is analogous but not exactly equal to the semipartial correlation, so its values may not always be bound between +/- one (such cases are likely to be rare). Importantly, for ordinary linear models the square of the semipartial correlation equals the increase in R-squared when that variable is added last to the model, directly linking the measure to model fit and 'variance explained'. See here for additional arguments in favour of the use of semipartial correlations.
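The link with R-squared can be illustrated directly (a sketch of the statistical relationship, not of the package internals): the squared semipartial correlation for a predictor in an ordinary linear model equals the gain in R-squared from adding that predictor last.

full <- lm(mpg ~ wt + hp, data = mtcars)
red <- lm(mpg ~ hp, data = mtcars)

## semipartial correlation of wt: correlation between the response and the
## component of wt not explained by the other predictor(s)
sp.wt <- cor(mtcars$mpg, resid(lm(wt ~ hp, data = mtcars)))
sp.wt^2

## increase in R-squared when wt is added last - should be equal
summary(full)$r.squared - summary(red)$r.squared

## the corresponding standardised (semipartial) effects from stdEff()
stdEff(full, std.x = TRUE, std.y = TRUE, unique.x = TRUE)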
If refit.x, cen.x, and unique.x are TRUE and there are interaction terms in the model, the model will be re-fit with any (newly-)centred continuous predictors, in order to calculate correct VIFs from the variance-covariance matrix. However, re-fitting may not be necessary in some circumstances, for example where the predictors have already been mean-centred and their values will not subsequently be resampled (e.g. via parametric bootstrap). Setting refit.x = FALSE in such cases will save time, especially with larger or more complex models and/or bootstrap runs.
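For example, where predictors have already been mean-centred and will not be resampled, the internal re-fit can reasonably be skipped (a usage sketch):

## predictors mean-centred beforehand
d <- mtcars
d[c("wt", "hp")] <- scale(d[c("wt", "hp")], scale = FALSE)
m <- lm(mpg ~ wt * hp, data = d)

## no need to re-fit internally, as predictors are already centred
stdEff(m, refit.x = FALSE)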
If r.squared = TRUE, model R-squared values are appended to effects via the R2 function, with any additional arguments passed via the '...' argument.
If incl.raw = TRUE, raw (unstandardised) effects are also appended, i.e. those with all centring and scaling options set to FALSE (though still adjusted for multicollinearity, where applicable). These may be of interest in some cases, for example to compare their bootstrapped distributions with those of the standardised effects.
Finally, if weights are specified, the function calculates a weighted average of the standardised effects across models (Burnham & Anderson 2002).
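As a final sketch, standardised effects might be averaged across a set of candidate models using Akaike weights; note that passing a plain numeric vector of per-model weights is an assumption about the expected format here.

m1 <- lm(mpg ~ wt + hp, data = mtcars)
m2 <- lm(mpg ~ wt + disp, data = mtcars)

## Akaike weights for the candidate set
aic <- AIC(m1, m2)$AIC
w <- exp(-(aic - min(aic)) / 2)
w <- w / sum(w)

## weighted average of standardised effects across models (assumed format)
stdEff(list(m1, m2), weights = w)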