sjp.lm: Plot estimates, predictions or effects of linear models

Description

Depending on the type, this function plots coefficients (estimates) of linear regressions with confidence intervals as a dot plot (forest plot), diagnostic plots of model assumptions for linear models, or slopes and scatter plots for each single coefficient. Supported models include panel models fitted with the plm-function from the plm-package and generalized least squares models fitted with the gls-function from the nlme-package. See type for details.
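
As a rough illustration of what the forest plot displays, the following base R sketch computes coefficient estimates and their 95% confidence intervals for a small model. It does not call sjp.lm and is not the package's internal code; the data frame layout, the column names and the decision to drop the intercept are assumptions made for illustration only.

```r
# Fit a small example model on a built-in data set
fit <- lm(Ozone ~ Wind + Temp + Solar.R, data = airquality)

# Collect estimates and 95% confidence intervals in one data frame
# (column names are illustrative, not sjp.lm's)
ci <- confint(fit)
est <- data.frame(
  term = names(coef(fit)),
  estimate = unname(coef(fit)),
  conf.low = unname(ci[, 1]),
  conf.high = unname(ci[, 2]),
  stringsAsFactors = FALSE
)

# Assumption: leave out the intercept and sort by estimate,
# similar in spirit to sort.est = TRUE
est <- est[est$term != "(Intercept)", ]
est <- est[order(est$estimate), ]
print(est)
```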

Usage

sjp.lm(fit, type = "lm", vars = NULL, group.estimates = NULL, remove.estimates = NULL, sort.est = TRUE, poly.term = NULL, title = NULL, legend.title = NULL, axis.labels = NULL, axis.title = NULL, resp.label = NULL, geom.size = NULL, geom.colors = "Set1", point.alpha = 0.2, scatter.plot = TRUE, wrap.title = 50, wrap.labels = 25, axis.lim = NULL, grid.breaks = NULL, show.values = TRUE, show.p = TRUE, show.ci = TRUE, show.legend = FALSE, show.loess = FALSE, show.loess.ci = FALSE, show.summary = FALSE, digits = 2, vline.type = 2, vline.color = "grey70", coord.flip = TRUE, y.offset = 0.15, facet.grid = TRUE, complete.dgns = FALSE, prnt.plot = TRUE, ...)

Arguments

fit

fitted linear regression model (of class lm, gls or plm).

type

type of plot. Use one of the following:

vars

numeric vector with column indices of selected variables or a character vector with variable names of selected variables from the fitted model, which should be used to plot estimates, fixed effects slopes (for lmer) or probability or incidence curves (for glmer) of random intercepts.

group.estimates

numeric or character vector, indicating a group identifier for each estimate. Dots and confidence intervals of estimates are coloured according to their group association. See 'Examples'.

remove.estimates

character vector with coefficient names that indicate which estimates should be removed from the plot. remove.estimates = "est_name" would remove the estimate est_name. Default is NULL, i.e. all estimates are printed.

sort.est

logical, determines whether estimates should be sorted according to their values. If group.estimates is not NULL, estimates are sorted according to their group assignment.

poly.term

name of a polynomial term in fit as string. Needs to be specified, if type = "poly", in order to plot marginal effects for polynomial terms. See 'Examples'.

title

character vector, used as plot title. Depending on plot type and function, will be set automatically. If title = "", no title is printed.

legend.title

character vector, used as title for the plot legend. Note that only some plot types have legends (e.g. type = "pred" or when grouping estimates with group.estimates).

axis.labels

character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically.

axis.title

character vector of length one or two (depending on the plot function and type), used as title(s) for the x and y axis. If not specified, a default labelling is chosen.

resp.label

name of dependent variable, as string. Only used if fitted model has only one predictor and type = "lm".

geom.size

size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes.

geom.colors

user defined color palette for geoms. If group.estimates is not specified, must either be vector with two color values or a specific color palette code (see 'Details' in sjp.grpfrq). Else, if group.estimates is specified, geom.colors must be a vector of same length as groups. See 'Examples'.

point.alpha

alpha value of point-geoms in the scatter plots.

scatter.plot

logical, if TRUE (default), a scatter plot of response and predictor values for each predictor of the model is plotted. Only applies for slope-type plots.

wrap.title

numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted.

wrap.labels

numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted.

axis.lim

numeric vector of length 2, defining the range of the plot axis. Depending on plot type, may affect either x- or y-axis, or both. For multiple plot outputs (e.g., from type = "eff" or type = "slope" in sjp.glm), axis.lim may also be a list of vectors of length 2, defining axis limits for each plot (only if non-faceted).

grid.breaks

numeric; sets the distance between breaks for the axis, i.e. a major grid line is drawn at every grid.breaks'th position.

show.values

logical, whether values should be plotted or not.

show.p

logical, adds significance levels to values, or value and variable labels.

show.ci

logical, if TRUE, depending on type, a confidence interval or region is added to the plot.

show.legend

logical, if TRUE, and depending on plot type and function, a legend is added to the plot.

show.loess

logical, if TRUE, and depending on type, an additional loess-smoothed line is plotted.

show.loess.ci

logical, if TRUE, a confidence region for the loess-smoothed line will be plotted. Default is FALSE. Only applies, if show.loess = TRUE (and for sjp.lmer, only applies if type = "fe.slope" or type = "fe.resid").

show.summary

logical, if TRUE, a summary with model statistics is added to the plot.

digits

numeric, number of digits after the decimal point when rounding estimates and values.

vline.type

linetype of the vertical "zero point" line. Default is 2 (dashed line).

vline.color

color of the vertical "zero point" line. Default value is "grey70".

coord.flip

logical, if TRUE, the x and y axis are swapped.

y.offset

numeric, offset for text labels when their alignment is adjusted to the top/bottom of the geom (see hjust and vjust).

facet.grid

TRUE to arrange the layout of multiple plots in a grid of an integrated single plot. This argument calls facet_wrap or facet_grid to arrange plots. Use plot_grid to plot multiple plot-objects as an arranged grid with grid.arrange.

complete.dgns

logical, if TRUE, additional tests are performed. Default is FALSE. Only applies if type = "ma".

prnt.plot

logical, if TRUE (default), plots the results as graph. Use FALSE if you don't want to plot any graphs. In either case, the ggplot-object will be returned as value.

...

other arguments, passed down to the effect or allEffects function when type = "eff".
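
The interplay of group.estimates and sort.est described above can be sketched in base R. This is a hedged illustration of the ordering idea only, not sjp.lm's implementation; the term names are taken from the grouping example in 'Examples', and keeping the original coefficient order within each group is an assumption.

```r
# Coefficient names and one group identifier per estimate
term_names <- c("c160age", "c12hour", "e17age", "c161sex", "c172code", "e16sex")
grp <- c(1, 2, 1, 3, 4, 3)

# Sort by group assignment; order() leaves ties in their original order,
# so estimates keep their model order within each group (assumption)
ord <- order(grp)
grouped <- data.frame(term = term_names[ord], group = grp[ord],
                      stringsAsFactors = FALSE)
print(grouped)
```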

Value

Depending on the type, in most cases (invisibly) returns the ggplot-object with the complete plot (plot) as well as the data frame that was used for setting up the ggplot-object (df). For type = "ma", an updated model with removed outliers is returned.
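
A minimal sketch of capturing the returned object without drawing the plot, based on the plot and df elements and the prnt.plot argument described on this page. It assumes an installed sjPlot version that still provides sjp.lm; the guard below skips the call otherwise, and the names has_sjp, sjp_lm and res are illustrative.

```r
fit <- lm(Ozone ~ Wind + Temp + Solar.R, data = airquality)

# Assumption: the installed sjPlot version still ships sjp.lm
has_sjp <- requireNamespace("sjPlot", quietly = TRUE) &&
  exists("sjp.lm", envir = asNamespace("sjPlot"), inherits = FALSE)

if (has_sjp) {
  sjp_lm <- get("sjp.lm", envir = asNamespace("sjPlot"))
  # Suppress printing; the result is still returned
  res <- sjp_lm(fit, prnt.plot = FALSE)
  print(res$plot)      # the ggplot-object, drawn on demand
  print(head(res$df))  # the data frame behind the plot
}
```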

Details

References

Gelman A (2008) "Scaling regression inputs by dividing by two standard deviations." Statistics in Medicine 27: 2865–2873. http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf

Hyndman RJ, Athanasopoulos G (2013) "Forecasting: principles and practice." OTexts; accessed from https://www.otexts.org/fpp/5/4.

Examples


# --------------------------------------------------
# plotting estimates of linear models as forest plot
# --------------------------------------------------
library(sjPlot)
# fit linear model
fit <- lm(Ozone ~ Wind + Temp + Solar.R, data = airquality)

# plot estimates with CI
sjp.lm(fit, grid.breaks = 2)

# plot estimates with CI
# and with narrower tick marks
# (because "grid.breaks" was not specified)
sjp.lm(fit)

# ---------------------------------------------------
# plotting regression line of linear model (done
# automatically if fitted model has only 1 predictor)
# ---------------------------------------------------
library(sjmisc)
data(efc)
# fit model
fit <- lm(neg_c_7 ~ quol_5, data=efc)
# plot regression line with label strings
sjp.lm(fit, resp.label = "Burden of care",
       axis.labels = "Quality of life", show.loess = TRUE)

# --------------------------------------------------
# plotting regression lines of each single predictor
# of a fitted model
# --------------------------------------------------
library(sjmisc)
data(efc)
# fit model
fit <- lm(tot_sc_e ~ c12hour + e17age + e42dep, data=efc)

# regression line and scatter plot
sjp.lm(fit, type = "slope")

# regression line w/o scatter plot
sjp.lm(fit, type = "slope", scatter.plot = FALSE)

# --------------------------
# plotting model assumptions
# --------------------------
sjp.lm(fit, type = "ma")

## Not run: 
# # --------------------------
# # grouping estimates
# # --------------------------
# library(sjmisc)
# data(efc)
# fit <- lm(barthtot ~ c160age + e17age + c12hour + e16sex + c161sex + c172code,
#           data = efc)
# 
# # order estimates according to coefficient's order
# sjp.lm(fit, group.estimates = c(1, 1, 2, 3, 3, 4),
#        geom.colors = c("green", "red", "blue", "grey"), sort.est = FALSE)
# 
# fit <- lm(barthtot ~ c160age + c12hour + e17age+ c161sex + c172code + e16sex,
#           data = efc)
# 
# # force order of estimates according to group assignment
# sjp.lm(fit, group.estimates = c(1, 2, 1, 3, 4, 3),
#        geom.colors = c("green", "red", "blue", "grey"), sort.est = TRUE)
# 
# # --------------------------
# # predicted values for response
# # --------------------------
# library(sjmisc)
# data(efc)
# efc$education <- to_label(to_factor(efc$c172code))
# fit <- lm(barthtot ~ c160age + c12hour + e17age+ education,
#           data = efc)
# 
# sjp.lm(fit, type = "pred", vars = "c160age")
# 
# # with loess
# sjp.lm(fit, type = "pred", vars = "e17age", show.loess = TRUE)
# 
# # grouped
# sjp.lm(fit, type = "pred", vars = c("c12hour", "education"))
# 
# # grouped, non-facet
# sjp.lm(fit, type = "pred", vars = c("c12hour", "education"),
#        facet.grid = FALSE)
# 
# # --------------------------
# # plotting polynomial terms
# # --------------------------
# library(sjmisc)
# data(efc)
# # fit sample model
# fit <- lm(tot_sc_e ~ c12hour + e17age + e42dep, data = efc)
# # "e17age" does not seem to be linearly correlated to response
# # try to find appropriate polynomial. Grey line (loess smoothed)
# # indicates best fit. Looks like x^3 has a good fit.
# # (not checked for significance yet).
# sjp.poly(fit, "e17age", 2:4, scatter.plot = FALSE)
# # fit new model
# fit <- lm(tot_sc_e ~ c12hour + e42dep +
#           e17age + I(e17age^2) + I(e17age^3),
#           data = efc)
# # plot marginal effects of polynomial term
# sjp.lm(fit, type = "poly", poly.term = "e17age")
# 
# library(splines)
# # fit new model with "splines"-package, "bs"
# fit <- lm(tot_sc_e ~ c12hour + e42dep + bs(e17age, 3), data = efc)
# # plot marginal effects of polynomial term, same call as above
# sjp.lm(fit, type = "poly", poly.term = "e17age")
## End(Not run)
