In insight::get_predicted(), the predict argument jointly modulates two
separate concepts: the scale and the uncertainty interval.
Confidence Interval (CI) vs. Prediction Interval (PI)
Linear models - lm(): For linear models, prediction intervals
(predict="prediction") show the range that is likely to contain the value of
a new observation, whereas confidence intervals (predict="expectation" or
predict="link") reflect the uncertainty around the estimated parameters (and
give the range of uncertainty of the regression line). In general, prediction
intervals (PIs) account for both the uncertainty in the model's parameters
and the random variation of the individual values. Thus, for a given
confidence level, prediction intervals are always wider than confidence
intervals. Moreover, prediction intervals will not necessarily become
narrower as the sample size increases (as they reflect not only the quality
of the fit, but also the variability within the data).
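The difference in width can be checked with a small sketch. It uses base R's predict() method for lm objects as an analogue, not get_predicted() itself; the model (mpg ~ wt on the built-in mtcars data) and the new_data values are arbitrary choices made for illustration.

```r
# Base R sketch: confidence vs. prediction intervals for a linear model.
# The model and the new_data values are illustrative assumptions.
model <- lm(mpg ~ wt, data = mtcars)
new_data <- data.frame(wt = c(2.5, 3.0, 3.5))

# Confidence interval: uncertainty about the regression line (the expectation).
conf_int <- predict(model, newdata = new_data,
                    interval = "confidence", level = 0.95)

# Prediction interval: also includes the residual variation of new observations.
pred_int <- predict(model, newdata = new_data,
                    interval = "prediction", level = 0.95)

ci_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pi_width <- pred_int[, "upr"] - pred_int[, "lwr"]

# Same point predictions, but the prediction interval is wider in every row.
print(cbind(wt = new_data$wt, fit = conf_int[, "fit"], ci_width, pi_width))
```

Both intervals are centered on the same fitted values; only the width differs.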
Generalized Linear models - glm(): For binomial models, prediction intervals
are of little use: for a binomial (Bernoulli) model whose dependent variable
is a vector of 1s and 0s, the prediction interval is simply [0, 1].
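To see why the interval degenerates, consider the 2.5% and 97.5% quantiles of a single Bernoulli draw. The sketch below uses base R's qbinom() with a few arbitrarily chosen success probabilities; it illustrates the point and is not how get_predicted() computes its intervals.

```r
# Quantile-based 95% interval for one Bernoulli outcome (size = 1).
# The probabilities are illustrative assumptions.
probs <- c(0.1, 0.5, 0.9)

lower <- qbinom(0.025, size = 1, prob = probs)
upper <- qbinom(0.975, size = 1, prob = probs)

# For any probability that is not extremely close to 0 or 1,
# the interval spans both possible outcomes.
print(data.frame(prob = probs, lower = lower, upper = upper))
```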
Link scale vs. Response scale
When users set the predict argument to "expectation", the predictions are
returned on the response scale, which is arguably the most convenient way to
understand and visualize relationships of interest. When users set the
predict argument to "link", predictions are returned on the link scale, and
no transformation is applied. For instance, for a logistic regression model,
the response scale corresponds to the predicted probabilities, whereas the
link scale corresponds to the predicted log-odds (probabilities on the logit
scale). Note that when users select predict="classification" in binomial
models, the get_predicted() function will first calculate predictions as if
the user had selected predict="expectation". Then, it will round the
predicted probabilities in order to return the most likely outcome.
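The relationship between the two scales, and the rounding step used for classification, can be sketched with base R alone. The example relies on predict() for glm objects (type = "link" and type = "response") as an analogue of the behaviour described above; the model (am ~ wt on mtcars) and the new_data values are illustrative assumptions.

```r
# Base R sketch: link scale vs. response scale in a logistic regression.
model <- glm(am ~ wt, data = mtcars, family = binomial())
new_data <- data.frame(wt = c(2.5, 3.0, 3.5))

# Link scale: predicted log-odds, no transformation applied.
log_odds <- predict(model, newdata = new_data, type = "link")

# Response scale: predicted probabilities.
probability <- predict(model, newdata = new_data, type = "response")

# The inverse link (here the logistic function) maps one scale onto the other.
back_transformed <- plogis(log_odds)

# Classification: round the probabilities to get the most likely outcome.
predicted_class <- round(probability)

print(data.frame(wt = new_data$wt, log_odds, probability, predicted_class))
```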
Heteroscedasticity-consistent standard errors
The arguments vcov_estimation, vcov_type and vcov_args can be used to
calculate robust standard errors for confidence intervals of predictions.
These arguments, when provided in get_predicted(), are passed down to
get_predicted_ci(); see the documentation there for more details.
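As a rough illustration of what a robust covariance matrix changes, the sketch below builds an HC0 "sandwich" estimator by hand in base R and uses it for the standard errors of predictions from a linear model. This is not the implementation behind get_predicted_ci(); HC0 is chosen only because it is the simplest estimator of this family, and the model, the new_data values and the helper se_fit() are hypothetical.

```r
# Base R sketch: classical vs. HC0 robust standard errors of predictions.
model <- lm(mpg ~ wt, data = mtcars)
X <- model.matrix(model)
e <- residuals(model)

# HC0 sandwich: (X'X)^-1 [ sum_i e_i^2 x_i x_i' ] (X'X)^-1
bread <- solve(crossprod(X))
meat <- crossprod(X * e)  # each row of X is scaled by its residual
vcov_hc0 <- bread %*% meat %*% bread

new_data <- data.frame(wt = c(2.5, 3.0, 3.5))
X_new <- model.matrix(delete.response(terms(model)), data = new_data)

# Hypothetical helper: SE of x'beta for each row x of X_new, given a vcov V.
se_fit <- function(X_new, V) sqrt(rowSums((X_new %*% V) * X_new))

se_classical <- se_fit(X_new, vcov(model))
se_robust <- se_fit(X_new, vcov_hc0)

# Confidence interval for the expectation, based on the robust SEs.
fit <- drop(X_new %*% coef(model))
crit <- qt(0.975, df = df.residual(model))
ci_robust <- cbind(lwr = fit - crit * se_robust, upr = fit + crit * se_robust)

print(cbind(wt = new_data$wt, fit, se_classical, se_robust, ci_robust))
```

Only the covariance matrix of the coefficients changes; the point predictions are identical under both estimators.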