plot_model() creates plots from regression models, either estimates (as so-called forest or dot-whisker plots) or marginal effects.
plot_model(model, type = c("est", "re", "eff", "pred", "int", "std", "std2",
"slope", "resid", "diag"), transform, terms = NULL, sort.est = NULL,
rm.terms = NULL, group.terms = NULL, order.terms = NULL,
pred.type = c("fe", "re"), mdrt.values = c("minmax", "meansd", "zeromax",
"quart", "all"), ri.nr = NULL, title = NULL, axis.title = NULL,
axis.labels = NULL, wrap.title = 50, wrap.labels = 25,
axis.lim = NULL, grid.breaks = NULL, ci.lvl = NULL, colors = "Set1",
show.intercept = FALSE, show.values = FALSE, show.p = TRUE,
show.data = FALSE, value.offset = NULL, value.size, digits = 2,
dot.size = NULL, line.size = NULL, vline.color = NULL, grid,
case = "parsed", auto.label = TRUE, bpe = "median",
bpe.style = "line", ...)

get_model_data(model, type = c("est", "re", "eff", "pred", "int", "std",
"std2", "slope", "resid", "diag"), transform, terms = NULL,
sort.est = NULL, rm.terms = NULL, group.terms = NULL,
order.terms = NULL, pred.type = c("fe", "re"), ri.nr = NULL,
ci.lvl = NULL, colors = "Set1", grid, case = "parsed", digits = 2,
...)
A regression model object. Depending on the type, many kinds of models are supported, e.g. from packages like stats, lme4, nlme, rstanarm, survey, glmmTMB, MASS, brms etc.
Type of plot. There are three groups of plot-types:
Coefficients
type = "est"
Forest-plot of estimates. If the fitted model contains only one predictor, a slope line is plotted instead.
type = "re"
For mixed effects models, plots the random effects.
type = "std"
Forest-plot of standardized beta values.
type = "std2"
Forest-plot of standardized beta values, where standardization is done by dividing by two standard deviations (see 'Details').
Marginal Effects
type = "pred"
Predicted values (marginal effects) for specific model terms. See ggpredict for details.
type = "eff"
Similar to type = "pred"; however, discrete predictors are held constant at their proportions (not reference level). See ggeffect for details.
type = "int"
Marginal effects of interaction terms in model.
Model diagnostics
type = "slope"
Slope of coefficients for each single predictor, against the response (linear relationship between each model term and response).
type = "resid"
Slope of coefficients for each single predictor, against the residuals (linear relationship between each model term and residuals).
type = "diag"
Check model assumptions.
A character vector, naming a function that will be applied on estimates and confidence intervals. By default, transform will automatically use "exp" as transformation for applicable classes of model (e.g. logistic or Poisson regression). Estimates of linear models remain untransformed. Use NULL if you want the raw, non-transformed estimates.
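As a rough illustration of what an "exp" transformation amounts to for a logistic regression, the base-R sketch below exponentiates the coefficients and Wald confidence limits by hand. It does not call plot_model(); the mtcars model and the use of confint.default() are assumptions made for this example, and plot_model() may compute its intervals differently.

```r
# Logistic regression on a built-in data set (illustrative model only)
m <- glm(am ~ wt, data = mtcars, family = binomial())

# Raw estimates and Wald interval limits are on the log-odds scale
raw <- cbind(estimate = coef(m), confint.default(m))

# Applying exp() to estimates and interval limits yields odds ratios
odds_ratios <- exp(raw)
print(round(odds_ratios, 3))
```

Because exp() is monotone, the ordering of lower limit, estimate and upper limit is preserved, but the "no effect" reference value moves from 0 to 1.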
Character vector with the names of those terms from model
that should be plotted. This argument depends on the plot-type:
For coefficient plots, selects the terms that should be plotted. All other terms are removed from the output.
For marginal effects plots, terms indicates for which terms marginal effects should be displayed. At least one term is required to calculate effects, maximum length is three terms, where the second and third term indicate the groups, i.e. predictions of the first term are grouped by the levels of the second (and third) term. terms may also indicate higher order terms (e.g. interaction terms). Indicating levels in square brackets allows for selecting only specific groups. Term name and levels in brackets must be separated by a whitespace character, e.g. terms = c("age", "education [1,3]"). For more details, see ggpredict.
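The bracket syntax for terms can be tried on any small model. The following is a minimal sketch, assuming sjPlot is installed; the mtcars model and the chosen levels are illustrative and not part of the original examples.

```r
# Illustrative model on a built-in data set
m <- lm(mpg ~ wt * cyl, data = mtcars)

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Predictions of mpg across wt, grouped by the cyl values 4 and 8 only.
  # Note the whitespace between the term name and the bracket.
  p <- sjPlot::plot_model(m, type = "pred", terms = c("wt", "cyl [4,8]"))
}
```

The first element of terms is drawn along the x-axis; the second defines the groups.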
Determines in which way estimates are sorted in the plot:
If NULL (default), no sorting is done and estimates are shown in the same order as they appear in the model formula.
If TRUE, estimates are sorted in descending order, with the highest estimate at the top.
If sort.est = "sort.all", estimates are re-sorted for each coefficient (only applies if type = "re" and grid = FALSE), i.e. the estimates of the random effects for each predictor are sorted and plotted in their own plot.
If type = "re", specify a predictor's / coefficient's name to sort estimates according to this random effect.
Character vector with names that indicate which terms should be removed from the plot. Counterpart to terms. rm.terms = "t_name" would remove the term t_name. Default is NULL, i.e. all terms are used. Note that this argument does not apply to Marginal Effects plots.
Numeric vector with group indices, to group coefficients. Each group of coefficients gets its own color (see 'Examples').
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette.
Character, only applies for Marginal Effects plots with mixed effects models. Indicates whether predicted values should be conditioned on random effects (pred.type = "re") or fixed effects only (pred.type = "fe", the default).
Indicates which values of the moderator variable should be used when plotting interaction terms (i.e. type = "int").
"minmax"
(default) minimum and maximum values (lower and upper bounds) of the moderator are used to plot the interaction between independent variable and moderator(s).
"meansd"
uses the mean value of the moderator as well as one standard deviation below and above mean value to plot the effect of the moderator on the independent variable (following the convention suggested by Cohen and Cohen and popularized by Aiken and West, i.e. using the mean, the value one standard deviation above, and the value one standard deviation below the mean as values of the moderator, see Grace-Martin K: 3 Tips to Make Interpreting Moderation Effects Easier).
"zeromax"
is similar to the "minmax" option; however, 0 is always used as minimum value for the moderator. This may be useful for predictors that don't have an empirical zero-value, but where absence of moderation should be simulated by using 0 as minimum.
"quart"
calculates and uses the quartiles (lower, median and upper) of the moderator value.
"all"
uses all values of the moderator variable.
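To make the options above concrete, the base-R sketch below computes the moderator values that each option describes. The helper moderator_values() is hypothetical and written only for this illustration; it is not the function sjPlot uses internally, and details such as rounding may differ.

```r
# Hypothetical helper: returns the moderator values described for each option
moderator_values <- function(x, method = c("minmax", "meansd", "zeromax", "quart")) {
  method <- match.arg(method)
  x <- x[!is.na(x)]
  switch(method,
    minmax  = range(x),                                   # lower and upper bound
    meansd  = mean(x) + c(-1, 0, 1) * sd(x),              # mean - sd, mean, mean + sd
    zeromax = c(0, max(x)),                               # 0 replaces the minimum
    quart   = unname(quantile(x, probs = c(0.25, 0.5, 0.75)))  # lower, median, upper
  )
}

# Compare all options for one continuous moderator
lapply(
  c(minmax = "minmax", meansd = "meansd", zeromax = "zeromax", quart = "quart"),
  function(method) moderator_values(mtcars$wt, method)
)
```

The option "all" is omitted because it simply uses every observed value of the moderator.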
Numeric vector. If type = "re" and the fitted model has more than one random intercept, ri.nr indicates which random effects of which random intercept (or: which list elements of ranef) will be plotted. Default is NULL, so all random effects will be plotted.
Character vector, used as plot title. By default, get_dv_labels is called to retrieve the label of the dependent variable, which will be used as title. Use title = "" to remove title.
Character vector of length one or two (depending on the plot function and type), used as title(s) for the x and y axis. If not specified, a default labelling is chosen. Note: Some plot types may not support this argument sufficiently. In such cases, use the returned ggplot-object and add axis titles manually with labs. Use axis.title = "" to remove axis titles.
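Adding axis titles manually to the returned object might look like the sketch below. It assumes sjPlot and ggplot2 are installed; the model and the label texts are made up for illustration, and which of x and y ends up on which visual axis depends on the plot type.

```r
# Illustrative model on a built-in data set
m <- lm(mpg ~ wt + hp, data = mtcars)

if (requireNamespace("sjPlot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p <- sjPlot::plot_model(m, type = "est")
  # Override the axis titles on the returned ggplot object
  p <- p + ggplot2::labs(x = "Predictor", y = "Estimate")
}
```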
Character vector with labels for the model terms, used as axis labels. By default, get_term_labels is called to retrieve the labels of the coefficients, which will be used as axis labels. Use axis.labels = "" or auto.label = FALSE to use the variable names as labels instead.
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted.
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted.
Numeric vector of length 2, defining the range of the plot axis. Depending on plot-type, may affect either x- or y-axis. For Marginal Effects plots, axis.lim may also be a list of two vectors of length 2, defining axis limits for both the x and y axis.
Numeric; sets the distance between breaks for the axis, i.e. at every grid.breaks'th position a major grid is plotted.
Numeric, the level of the confidence intervals (error bars). Use ci.lvl = NA to remove error bars. For stanreg-models, ci.lvl defines the (outer) probability for the hdi (High Density Interval) that is plotted. By default, stanreg-models are printed with two intervals: the "inner" interval, which defaults to the 50%-HDI; and the "outer" interval, which defaults to the 89%-HDI. ci.lvl affects only the outer interval in such cases. See prob.inner and prob.outer under the ...-argument for more details.
May be a character vector of color values in hex-format, valid color value names (see demo("colors")) or a name of a color brewer palette. The following options are valid for the colors argument:
If not specified, a default color brewer palette will be used, which is suitable for the plot style.
If "gs", a greyscale will be used.
If "bw", and plot-type is a line-plot, the plot is black/white and uses different line types to distinguish groups (see this package-vignette).
If colors is any valid color brewer palette name, the related palette will be used. Use display.brewer.all to view all available palette names.
If wesanderson is installed, you may also specify a name of a palette from that package.
Else specify own color values or names as vector (e.g. colors = "#00ff00").
Logical, if TRUE, the intercept of the fitted model is also plotted. Default is FALSE. If transform = "exp", please note that due to exponential transformation of estimates, the intercept in some cases is non-finite and the plot cannot be created.
Logical, whether values should be plotted or not.
Logical, adds asterisks that indicate the significance level of estimates to the value labels.
Logical, for Marginal Effects plots, also plots the raw data points.
Numeric, offset for text labels to adjust their position relative to the dots or lines.
Numeric, indicates the size of value labels. Can be used for all plot types where the argument show.values is applicable, e.g. value.size = 4.
Numeric, amount of digits after decimal point when rounding estimates or values.
Numeric, size of the dots that indicate the point estimates.
Numeric, size of the lines that indicate the error bars.
Color of the vertical "zero effect" line. Default color is inherited from the current theme.
Logical, if TRUE, multiple plots are plotted as grid layout.
Desired target case. Labels will automatically be converted into the specified character case. See to_any_case for more details on this argument.
Logical, if TRUE (the default), plot-labels are based on value and variable labels, if the data is labelled. See get_label and get_term_labels for details.
For Stan-models (fitted with the rstanarm- or brms-package), the Bayesian point estimate is, by default, the median of the posterior distribution. Use bpe to define other functions to calculate the Bayesian point estimate. bpe needs to be a character naming the specific function, which is passed to the fun-argument in typical_value. So, bpe = "mean" would calculate the mean value of the posterior distribution.
For Stan-models (fitted with the rstanarm- or brms-package), the Bayesian point estimate is indicated as a small, vertical line by default. Use bpe.style = "dot" to plot a dot instead of a line for the point estimate.
Other arguments, passed down to various functions. Here is a list of supported arguments and their description in detail.
prob.inner and prob.outer
For Stan-models (fitted with the rstanarm- or brms-package) and coefficients plot-types, you can specify numeric values between 0 and 1 for prob.inner and prob.outer, which will then be used as inner and outer probabilities for the uncertainty intervals (HDI). By default, the inner probability is 0.5 and the outer probability is 0.89 (unless ci.lvl is specified - in this case, ci.lvl is used as outer probability).
size.inner
For Stan-models and Coefficients plot-types, you can specify the width of the bar for the inner probabilities. Default is 0.1.
width, alpha and scale
Passed down to geom_errorbar() or geom_density_ridges(), for forest or diagnostic plots; or passed down to plot.ggeffects for Marginal Effects plots.
show.loess
Logical, for diagnostic plot-types "slope" and "resid", adds (or hides) a loess-smoothed line to the plot.
When plotting marginal effects, arguments are also passed down to ggpredict, ggeffect or plot.ggeffects.
Depending on the plot-type, plot_model() returns a ggplot-object or a list of such objects. get_model_data returns the associated data with the plot-object as tidy data frame, or (depending on the plot-type) a list of such data frames.
get_model_data simply calls plot_model() and returns the data from the ggplot-object. Hence, it is rather inefficient and should be used as an alternative to broom's tidy()-function only in specific situations.
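A minimal sketch of retrieving the plot data, assuming sjPlot is installed; the model is illustrative, and the exact columns of the returned data frame are not documented here, so the example only inspects the structure.

```r
# Illustrative model on a built-in data set
m <- lm(mpg ~ wt + hp, data = mtcars)

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Same arguments as plot_model(), but the underlying data is returned
  est <- sjPlot::get_model_data(m, type = "est")
  str(est)
}
```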
Some notes on the different plot-types:
type = "std2"
Plots standardized beta values; however, standardization follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors. This standardization uses the standardize-function from the arm-package.
type = "pred"
Plots marginal effects. Simply wraps ggpredict.
type = "eff"
Plots marginal effects. Simply wraps ggeffect.
type = "int"
A shortcut for marginal effects plots, where interaction terms are automatically detected and used as `terms`-argument. Furthermore, if the moderator variable (the second - and third - term in an interaction) is continuous, type = "int" automatically chooses useful values based on the `mdrt.values`-argument, which are passed to `terms`. Then, ggpredict is called. type = "int" plots the interaction term that appears first in the formula along the x-axis, while the second (and possibly third) variable in an interaction is used as grouping factor(s) (moderating variable). Use type = "pred" or type = "eff" and specify a certain order in the `terms`-argument to indicate which variable(s) should be used as moderator.
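The two-standard-deviation rescaling mentioned under type = "std2" can be sketched in base R. This is only an illustration of the idea of rescaling numeric inputs, under the assumption that binary predictors are left as they are; it does not call arm's standardize-function, whose defaults may differ, and rescale_2sd() is a hypothetical helper.

```r
# Hypothetical helper: center a numeric vector and divide by two standard deviations
rescale_2sd <- function(x) (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))

d <- mtcars
d$wt_z <- rescale_2sd(d$wt)
d$hp_z <- rescale_2sd(d$hp)

# am is a binary 0/1 predictor and is left untransformed
m_raw <- lm(mpg ~ wt + hp + am, data = mtcars)
m_2sd <- lm(mpg ~ wt_z + hp_z + am, data = d)

# The rescaled slope equals the raw slope multiplied by two standard deviations
coef(m_2sd)["wt_z"]
coef(m_raw)["wt"] * 2 * sd(mtcars$wt)
```

After rescaling, a one-unit change in wt_z corresponds to a two-standard-deviation change in wt, which puts its coefficient on a scale similar to the 0/1 contrast of am.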
Gelman A (2008) "Scaling regression inputs by dividing by two standard deviations." Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf
Package-vignette about plot_model().
# NOT RUN {
# prepare data
library(sjPlot)  # provides plot_model(), get_model_data() and plot_grid()
library(sjmisc)
data(efc)
efc <- to_factor(efc, c161sex, e42dep, c172code)
m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)
# simple forest plot
plot_model(m)
# grouped coefficients
plot_model(m, group.terms = c(1, 2, 3, 3, 3, 4, 4))
# multiple plots, as returned from "diagnostic"-plot type,
# can be arranged with 'plot_grid()'
# }
# NOT RUN {
p <- plot_model(m, type = "diag")
plot_grid(p)
# }
# NOT RUN {
# plot random effects
library(lme4)
m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
plot_model(m, type = "re")
# plot marginal effects
plot_model(m, type = "eff", terms = "Days")
# plot interactions
# }
# NOT RUN {
m <- glm(
tot_sc_e ~ c161sex + c172code * neg_c_7,
data = efc,
family = poisson()
)
# type = "int" automatically selects groups for continuous moderator
# variables - see argument 'mdrt.values'. The following function call is
# identical to:
# plot_model(m, type = "pred", terms = c("c172code", "neg_c_7 [7,28]"))
plot_model(m, type = "int")
# switch moderator
plot_model(m, type = "pred", terms = c("neg_c_7", "c172code"))
# same as
# ggeffects::ggpredict(m, terms = c("neg_c_7", "c172code"))
# }
# NOT RUN {
# plot Stan-model
# }
# NOT RUN {
if (require("rstanarm")) {
data(mtcars)
m <- stan_glm(mpg ~ wt + am + cyl + gear, data = mtcars, chains = 1)
plot_model(m, bpe.style = "dot")
}
# }