Posterior predictive checks mean “simulating replicated data under the fitted model and then comparing these to the observed data” (Gelman and Hill, 2007, p. 158). Posterior predictive checks can be used to “look for systematic discrepancies between real and simulated data” (Gelman et al. 2014, p. 169).
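The "simulate, then compare" idea can be sketched by hand with base R. This is an illustration only and is not necessarily what check_predictions() does internally: it draws replicated responses from a fitted linear model with stats::simulate() and compares one summary statistic (the median, an arbitrary choice made for this sketch) between the observed and the replicated data.

```r
# Sketch only: a manual version of "simulate replicated data, then compare".
set.seed(123)
model <- lm(mpg ~ disp, data = mtcars)

# 50 replicated response vectors drawn from the fitted model
# (one column per replicate).
yrep <- simulate(model, nsim = 50)

# Compare one summary statistic of the observed response with its
# distribution across the replicated data sets. The median is an
# arbitrary choice for this illustration.
obs_median <- median(mtcars$mpg)
rep_medians <- vapply(yrep, median, numeric(1))

print(obs_median)
print(quantile(rep_medians, probs = c(0.025, 0.5, 0.975)))
```

If the observed statistic falls far outside the spread of the replicated statistics, that points to a systematic discrepancy between the model and the data.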
performance provides posterior predictive check methods for a variety of frequentist models (e.g., lm, merMod, glmmTMB, ...). For Bayesian models, the model is passed to bayesplot::pp_check().
check_predictions(
  object,
  iterations = 50,
  check_range = FALSE,
  re_formula = NULL,
  ...
)

posterior_predictive_check(
  object,
  iterations = 50,
  check_range = FALSE,
  re_formula = NULL,
  ...
)

check_posterior_predictions(
  object,
  iterations = 50,
  check_range = FALSE,
  re_formula = NULL,
  ...
)
object
A statistical model.

iterations
The number of draws to simulate/bootstrap.

check_range
Logical, if TRUE, includes a plot with the minimum value of the original response against the minimum values of the replicated responses, and the same for the maximum value. This plot helps to judge whether the variation in the original data is captured by the model or not (Gelman et al. 2020, p. 163). The minimum and maximum values of y should be inside the range of the related minimum and maximum values of yrep.

re_formula
Formula containing group-level effects (random effects) to be considered in the simulated data. If NULL (default), condition on all random effects. If NA or ~0, condition on no random effects. See simulate() in lme4.

...
Passed down to simulate().
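The comparison described for check_range can be approximated by hand. The following is a minimal sketch using base R only. It assumes that comparing the observed minimum and maximum with the minima and maxima of the replicated data captures the idea, and it does not reproduce the plot that the package draws.

```r
# Sketch only: a manual version of the range check described above.
set.seed(42)
model <- lm(mpg ~ disp, data = mtcars)

y <- mtcars$mpg                      # observed response
yrep <- simulate(model, nsim = 50)   # replicated responses, one column each

# Minimum and maximum of every replicated data set.
rep_min <- vapply(yrep, min, numeric(1))
rep_max <- vapply(yrep, max, numeric(1))

# TRUE if the observed extreme lies inside the range of replicated extremes.
min_covered <- min(y) >= min(rep_min) && min(y) <= max(rep_min)
max_covered <- max(y) >= min(rep_max) && max(y) <= max(rep_max)

print(c(min_covered = min_covered, max_covered = max_covered))
```

With the package itself, the corresponding call would be check_predictions(model, check_range = TRUE).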
A data frame of simulated responses and the original response vector.
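The returned object can be stored and inspected instead of plotted. The sketch below only looks at its class, dimensions, and first column names. The exact class attributes and column names are not stated on this page, so nothing here should be read as a guarantee about them.

```r
# Sketch only: inspect the returned object without plotting it.
# The guard skips the example when 'performance' is not installed.
if (requireNamespace("performance", quietly = TRUE)) {
  model <- lm(mpg ~ disp, data = mtcars)

  # Assigning the result avoids triggering the print/plot method.
  out <- performance::check_predictions(model, iterations = 20)

  print(class(out))        # class attributes (not documented on this page)
  print(dim(out))          # rows = observations, columns = draws + response
  print(head(names(out)))  # first few column names
}
```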
An example of how posterior predictive checks can also be used for model comparison is Figure 6 in Gabry et al. (2019).
The model shown in the right panel (b) can simulate new data that are more similar to the observed outcome than the model in the left panel (a). Thus, model (b) is likely to be preferred over model (a).
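A rough version of such a comparison can be sketched with base R. The two models below are hypothetical stand-ins chosen for this illustration (a Gaussian and a Poisson model for the count outcome in the warpbreaks data set); they are not the models from the figure. The quantile_gap() helper is likewise an assumption of this sketch, not a function from performance or bayesplot.

```r
# Sketch only: compare how closely two models reproduce the observed outcome.
set.seed(1)

# Model (a): Gaussian linear model for a count outcome.
model_a <- lm(breaks ~ wool + tension, data = warpbreaks)
# Model (b): Poisson regression for the same outcome.
model_b <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson())

yrep_a <- simulate(model_a, nsim = 50)
yrep_b <- simulate(model_b, nsim = 50)

# Hypothetical helper: average absolute gap between the deciles of each
# replicated data set and the deciles of the observed outcome.
quantile_gap <- function(yrep, y) {
  probs <- seq(0.1, 0.9, by = 0.1)
  obs_q <- quantile(y, probs)
  gaps <- vapply(
    yrep,
    function(col) mean(abs(quantile(col, probs) - obs_q)),
    numeric(1)
  )
  mean(gaps)
}

gap <- c(
  model_a = quantile_gap(yrep_a, warpbreaks$breaks),
  model_b = quantile_gap(yrep_b, warpbreaks$breaks)
)
print(gap)
```

A smaller gap means the replicated data sit closer to the observed outcome on this crude measure. A visual check, for example calling check_predictions() on each model, remains the more informative comparison.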
Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., & Gelman, A. (2019). Visualization in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society), 182(2), 389–402. https://doi.org/10.1111/rssa.12378
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge; New York: Cambridge University Press.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2014). Bayesian data analysis (Third edition). CRC Press.
Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press.
library(performance)

# Fit a simple linear model.
model <- lm(mpg ~ disp, data = mtcars)

# The check is plotted via the 'see' package, so run it only if 'see' is available.
if (require("see")) {
  check_predictions(model)
}