getY will return the response variable from a model by summing the fitted
values and the response residuals. If link = TRUE and the model is a GLM,
the response is transformed using the model link function. However, if this
transformation results in undefined values, it is replaced by an estimate
based on the 'working' response variable of the GLM (see below). The
function can also be used to transform a variable (supplied to mod) using
the link function from the specified family - in which case the link
argument is ignored.
Estimating the link-transformed response
A key challenge in generating fully standardised model coefficients for a
generalised linear model (GLM) with a non-identity link function is the
difficulty in calculating appropriate standardised ranges (typically the
standard deviation) for the response variable in the link scale. This is
because directly transforming the response will often produce undefined
values. Although methods for circumventing this issue by indirectly
estimating the variance of the link-transformed response have been proposed
- including a latent-theoretic approach for binomial models (McKelvey &
Zavoina 1975) and a more general variance-based method using a pseudo
R-squared (Menard 2011) - here an alternative approach is used. Where
transformed values are undefined, the function will instead return the
synthetic 'working' response from the iteratively reweighted least squares
(IRLS) algorithm of the GLM (McCullagh & Nelder 1989). This is
reconstructed as the sum of the linear predictor and the working residuals
- with the latter comprising the error of the model in the link scale. The
advantage of this approach is that a relatively straightforward
'transformation' of any non-Gaussian response is readily attainable in all
cases. The standard deviation (or other relevant range) can then be
calculated using values of the transformed response and used to scale the
coefficients. An additional benefit for piecewise SEMs is that the
transformed rather than original response can then be specified as a
predictor in other models, ensuring that standardised indirect and total
effects are calculated correctly (i.e. using the same units for the
variable).
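As a rough base R illustration of the reconstruction described above, the working response of a fitted GLM can be formed from the linear predictor and the working residuals, and its standard deviation used to scale a coefficient. The data and object names are assumptions for illustration; note that this uses the working response of the fitted model itself, not the 'final' working response produced by the iterative procedure.

```r
# Hypothetical count data, for illustration only
d <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8),
                y = c(0, 1, 0, 2, 3, 2, 5, 7))
m <- glm(y ~ x, family = poisson, data = d)

# Working response = linear predictor + working residuals (link scale)
z <- predict(m, type = "link") + resid(m, type = "working")
all(is.finite(z))   # defined even where log(y) is not

# Standardise the slope using the SD of the transformed response
sd_z <- sd(z)
b_std <- coef(m)["x"] * sd(d$x) / sd_z
b_std
```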
To ensure a high level of 'accuracy' in the working response - in the sense
that the inverse-transformed values are practically indistinguishable from
the original response - the function uses the following iterative fitting
procedure to calculate a 'final' working response:
1. A new GLM of the same error family is fit with the original response
   variable as both predictor and response, using a single IRLS iteration
2. The working response is calculated from this model
3. The inverse transformation of the working response is then calculated
4. If the inverse transformation is effectively equal to the original
   response (tested using all.equal with the default tolerance), the working
   response is returned; otherwise, the GLM is re-fit with the working
   response now as the predictor, and steps 2-4 are repeated - each time with
   an additional IRLS iteration
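The procedure above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the package's own code: the helper name working_response, the max_iter safety cap, and the example counts are all hypothetical, and the starting fit (response as its own predictor, one iteration) is inferred from the description of the re-fitting step.

```r
# Sketch only: 'working_response' and 'max_iter' are hypothetical names,
# not part of any package API.
working_response <- function(y, family, max_iter = 50) {
  x <- y                      # start with the response as its own predictor
  z <- NULL
  for (i in seq_len(max_iter)) {
    # Fit with i iterations; non-convergence warnings are expected here
    m <- suppressWarnings(
      glm(y ~ x, family = family, control = glm.control(maxit = i))
    )
    # Working response = linear predictor + working residuals
    z <- as.numeric(predict(m, type = "link") + resid(m, type = "working"))
    # Stop once the back-transformed values match the original response
    if (isTRUE(all.equal(family$linkinv(z), y))) break
    x <- z                    # otherwise re-fit using the working response
  }
  z
}

y <- c(0, 1, 2, 0, 3, 5, 1, 0, 4, 2)   # made-up counts containing zeros
z <- working_response(y, poisson())
all(is.finite(z))                       # unlike log(y), no -Inf values
sd(z)                                   # candidate value for scaling
```

The cap on iterations is a defensive choice for the sketch: if the tolerance is not met within max_iter re-fits, the last working response is returned rather than looping indefinitely.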
This approach generates a transformation of the response variable that
closely resembles the direct transformation wherever the two can be
compared - see Examples. It also ensures
that the transformed values, and hence the standard deviation, are the same
for any GLM fitting the same response - provided it uses the same link
function - and so facilitates model comparisons, selection, and averaging.