stdCoeff calculates fully standardised coefficients in standard deviation units for linear, generalised linear, and mixed models.
It achieves this by adjusting the 'raw' model coefficients, so no standardisation of input variables is required beforehand. Users can simply specify the model with all variables in their original units and the function will do the rest. However, the user is free to scale and/or centre any input variables beforehand, which will not affect the outcome of standardisation (provided any scaling is by standard deviations). This may be desirable in some cases, for example to increase numerical stability during model fitting when variables are on widely different scales.
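This invariance to pre-scaling can be checked numerically. A minimal sketch in Python (NumPy) rather than R, using simulated data; std_beta is a hypothetical helper for illustration only and is not part of the package:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = rng.normal(50, 10, n)                 # predictor in 'raw' units
y = 5 + 0.2 * x + rng.normal(0, 2, n)

def std_beta(x, y):
    # Slope of a simple OLS fit, standardised by sd(x)/sd(y)
    b = np.polyfit(x, y, 1)[0]
    return b * x.std(ddof=1) / y.std(ddof=1)

# Standardising the raw-units model...
beta_raw = std_beta(x, y)

# ...gives the same result as standardising a model fit to a predictor
# that was already scaled by its standard deviation (and centred)
beta_scaled = std_beta((x - x.mean()) / x.std(ddof=1), y)

assert np.isclose(beta_raw, beta_scaled)
```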
If arguments cen.x or cen.y are TRUE, model estimates will be calculated as if all predictors (x) and/or the response variable (y) were mean-centred prior to model fitting. Thus, for an ordinary linear model where centring of both x and y is specified, the intercept will be zero (the mean, or weighted mean, of the centred response). In addition, if cen.x = TRUE and there are interacting terms in the model, all coefficients for lower order terms of the interaction are adjusted using an expression which ensures that each main effect or lower order term is estimated at the mean values of the terms it interacts with (zero in a 'centred' model), typically improving the interpretation of coefficients. The expression used comprises a weighted sum of all the coefficients that contain the lower order term, with the weight for the term itself being zero and those for 'containing' terms being the product of the means of the other variables involved in that term (i.e. those not in the lower order term itself). For example, for a three-way interaction (x1 * x2 * x3), the expression for the main effect \(\beta_{1}\) would be:
$$\beta_{1} + \beta_{12} \bar{x}_{2} + \beta_{13} \bar{x}_{3} + \beta_{123} \bar{x}_{2} \bar{x}_{3}$$
(adapted from here)
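The identity behind this adjustment can be verified numerically for the two-way case: the adjusted main effect from an uncentred fit equals the main effect obtained by refitting with mean-centred predictors. A sketch in Python (NumPy) with simulated data, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(2, 1, n)
x2 = rng.normal(-1, 2, n)
y = 1 + 0.5 * x1 - 0.3 * x2 + 0.8 * x1 * x2 + rng.normal(0, 1, n)

# Fit y ~ x1 * x2 on the raw (uncentred) variables
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]      # [b0, b1, b2, b12]

# Adjust the main effect of x1 to its value at the mean of x2:
# beta_1 + beta_12 * xbar_2
b1_adj = b[1] + b[3] * x2.mean()

# The same coefficient falls out of a refit with mean-centred predictors
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
Xc = np.column_stack([np.ones(n), x1c, x2c, x1c * x2c])
bc = np.linalg.lstsq(Xc, y, rcond=None)[0]

assert np.isclose(b1_adj, bc[1])
```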
In addition, if std.x = TRUE or unique.x = TRUE (see below), product terms for interactive effects will be recalculated using mean-centred variables, to ensure that standard deviations and variance inflation factors (VIFs) for predictors are calculated correctly (for the latter purpose the model must be re-fit, to recalculate the variance-covariance matrix).
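The motivation for recalculating product terms is that a raw product is typically strongly correlated with its component main effects, whereas a product of mean-centred variables is not. A small numerical illustration in Python (NumPy), with simulated independent predictors; not package code:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
x1 = rng.normal(5, 1, n)   # predictors with non-zero means
x2 = rng.normal(3, 1, n)

# The raw product term is highly correlated with its components...
raw = abs(np.corrcoef(x1, x1 * x2)[0, 1])

# ...whereas the product of mean-centred variables is nearly uncorrelated
# with them, so SDs and VIFs computed from the centred product better
# reflect the interaction term itself
cen = abs(np.corrcoef(x1, (x1 - x1.mean()) * (x2 - x2.mean()))[0, 1])

assert cen < raw
```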
If std.x = TRUE, coefficients are standardised by multiplying by the standard deviations of the predictor variables (or terms), while if std.y = TRUE they are divided by the standard deviation of the response. If the model is a GLM, the latter is calculated using the link-transformed response (or its estimate), generated via the function getY. If both arguments are TRUE, the coefficients are 'fully' standardised in the traditional sense, and are often referred to as 'betas'.
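For an ordinary linear model, this adjustment of the raw coefficient is equivalent to fitting the model to z-scored variables. A sketch in Python (NumPy) with simulated data, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = rng.normal(10, 3, n)
y = 2 + 1.5 * x + rng.normal(0, 4, n)

# Raw slope from a simple OLS fit
b = np.polyfit(x, y, 1)[0]

# Fully standardised coefficient ('beta'): multiply by sd(x), divide by sd(y)
beta = b * x.std(ddof=1) / y.std(ddof=1)

# Equals the slope from a model fit to z-scored variables
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta_z = np.polyfit(zx, zy, 1)[0]
assert np.isclose(beta, beta_z)

# For simple (one-predictor) regression, the beta is also the
# Pearson correlation between predictor and response
assert np.isclose(beta, np.corrcoef(x, y)[0, 1])
```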
If unique.x = TRUE (default), coefficients are adjusted for multicollinearity among predictors by dividing by the square root of the VIFs (Dudgeon 2016, Thompson et al. 2017). If they have also been standardised by the standard deviations of x and y, this converts them to semipartial correlations, i.e. the correlation between the unique components of predictors (residualised on the other predictors) and the response variable. This measure of effect size is arguably more interpretable and useful than the traditional standardised coefficient, as it is always estimated independently of other predictors and so can more readily be compared both within and across models. Values range from zero (no effect) to +/- 1 (perfect relationship), rather than from zero to +/- infinity (as for betas), putting them on the same scale as the bivariate correlation between predictor and response. For GLMs, however, the measure is analogous but not exactly equal to the semipartial correlation, so its values may not always be bound between +/- 1 (such cases are likely rare). Crucially, for ordinary linear models, the square of the semipartial correlation equals the increase in R-squared when that variable is added last to the model, directly linking the measure to model fit and 'variance explained'. See here for additional arguments in favour of the use of semipartial correlations.
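Both properties (beta divided by the square root of the VIF gives the semipartial correlation, whose square is the last-entry increase in R-squared) can be checked numerically. A sketch in Python (NumPy) with simulated correlated predictors; the r2 helper is hypothetical, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)        # deliberately correlated predictors
y = 1 + 0.4 * x1 + 0.7 * x2 + rng.normal(size=n)

def r2(X, y):
    # R-squared of an OLS fit with intercept
    X = np.column_stack([np.ones(len(y)), X])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - resid.var() / y.var()

# Fully standardised coefficient for x1 from the two-predictor model
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
beta1 = b[1] * x1.std(ddof=1) / y.std(ddof=1)

# VIF of x1 = 1 / (1 - R^2 of x1 regressed on the other predictors)
vif1 = 1 / (1 - r2(x2, x1))

# Dividing the beta by sqrt(VIF) gives the semipartial correlation...
sr1 = beta1 / np.sqrt(vif1)

# ...whose square equals the increase in R^2 when x1 is added last
delta_r2 = r2(np.column_stack([x1, x2]), y) - r2(x2, y)
assert np.isclose(sr1**2, delta_r2)
```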
If r.squared = TRUE, R-squared values are also returned, via the R2 function.
Finally, if weights are specified, the function calculates a weighted average of the standardised coefficients across models (Burnham & Anderson 2002).
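The averaging itself is a simple weighted mean of the per-model coefficients. A sketch in Python (NumPy) with entirely hypothetical coefficient and weight values, e.g. Akaike weights from a candidate model set:

```python
import numpy as np

# Standardised coefficient for the same predictor in three candidate models
coefs = np.array([0.42, 0.38, 0.45])   # hypothetical values

# Model weights, e.g. Akaike weights (here already summing to 1)
w = np.array([0.6, 0.3, 0.1])

# Model-averaged coefficient: weighted mean across models
avg = np.sum(w * coefs) / np.sum(w)
# avg == 0.6*0.42 + 0.3*0.38 + 0.1*0.45 = 0.411
assert np.isclose(avg, 0.411)
```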