by_2sd rescales regression results to facilitate making dot-and-whisker plots using dwplot.
by_2sd(df, dataset)
Returns a tidy data frame.
df: A data frame including the variables term (names of independent variables), estimate (corresponding coefficient estimates), std.error (corresponding standard errors), and optionally model (when multiple models are desired on a single plot), such as those generated by tidy.
dataset: The data analyzed in the models whose results are recorded in df, or (preferably) the model matrix used by the models in df; the information required for complex models can more easily be generated from the model matrix than from the original data set. In many cases the model matrix can be extracted from the original model via model.matrix.
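As a minimal sketch of why the model matrix is the preferred input, the base R snippet below fits a model with a factor and an interaction term and compares the coefficient names with the columns available in the raw data and in the model matrix. The model formula is an illustrative assumption, not part of the original example.

```r
data(mtcars)

# A model with a factor and an interaction: its terms do not all
# correspond to columns of the original data set
m <- lm(mpg ~ wt + factor(cyl) + wt:disp, data = mtcars)

# Extract the model matrix actually used in fitting
X <- model.matrix(m)

# Term names as they would appear in a tidy data frame of results
model_terms <- names(coef(m))

# Terms with no matching column in the raw data
missing_in_data <- setdiff(model_terms, names(mtcars))

# Every term has a matching column in the model matrix
all_in_matrix <- all(model_terms %in% colnames(X))

print(missing_in_data)
print(all_in_matrix)
```

Because every term has a matching column in the model matrix, the standard deviation of each predictor can be read from it directly, without reconstructing dummy or interaction variables from the original data.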
by_2sd takes the results of regression models saved as tidy data frames and multiplies the estimates and standard errors for predictors that are not binary by twice the standard deviation of these variables in the dataset analyzed. Standardizing in this way yields coefficients that are directly comparable to each other and to those for untransformed binary predictors (Gelman 2008) and so facilitates plotting using dwplot. Note that the current version of by_2sd does not subtract the mean (in contrast to Gelman's (2008) formula). However, all estimates and standard errors of the independent variables are the same as if the mean were subtracted. The only difference from Gelman (2008) is in the intercept: subtracting the mean would shift it, for each variable in the model, by the coefficient times the mean of that variable.
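The rescaling described above can be checked by hand. The following sketch uses only base R and does not call by_2sd itself: it multiplies the estimates and standard errors of a fitted model by twice each predictor's standard deviation, then refits the model on centered, 2-SD-scaled predictors in the manner of Gelman (2008) and compares the two. The helper names (predictors, two_sd, rescaled, scaled, m_gelman) are illustrative, and all three predictors are treated as non-binary.

```r
data(mtcars)
predictors <- c("wt", "cyl", "disp")

# Original model on the untransformed data
m <- lm(mpg ~ wt + cyl + disp, data = mtcars)
est <- coef(summary(m))

# Twice the standard deviation of each (non-binary) predictor
two_sd <- sapply(mtcars[predictors], function(x) 2 * sd(x))

# Rescale the parameters directly, without refitting
rescaled <- data.frame(
  term      = predictors,
  estimate  = est[predictors, "Estimate"] * two_sd,
  std.error = est[predictors, "Std. Error"] * two_sd
)

# Refit on centered predictors divided by two standard deviations
scaled <- mtcars
scaled[predictors] <- lapply(mtcars[predictors],
                             function(x) (x - mean(x)) / (2 * sd(x)))
m_gelman <- lm(mpg ~ wt + cyl + disp, data = scaled)
est_gelman <- coef(summary(m_gelman))

# Slopes and standard errors agree between the two approaches
same_estimates <- all.equal(unname(est_gelman[predictors, "Estimate"]),
                            unname(rescaled$estimate))
same_std_errors <- all.equal(unname(est_gelman[predictors, "Std. Error"]),
                             unname(rescaled$std.error))

# Only the intercept differs: centering shifts it by the sum of
# coefficient times mean over the predictors
shift <- sum(coef(m)[predictors] * colMeans(mtcars[predictors]))
same_intercept <- all.equal(unname(coef(m_gelman)[1]),
                            unname(coef(m)[1] + shift))
```

Rescaling the parameters needs only the fitted coefficients and the predictor standard deviations, whereas the centered version requires fitting the model a second time.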
An alternative available in some circumstances is to pass a model object to arm::standardize before passing the results to tidy and then on to dwplot. The advantages of by_2sd are that (1) it takes a tidy data frame as its input and so is not restricted to only those model objects that standardize accepts and (2) it is much more efficient because it operates on the parameters rather than refitting the original model with scaled data.
Gelman, Andrew. 2008. "Scaling Regression Inputs by Dividing by Two Standard Deviations." Statistics in Medicine, 27:2865-2873.
library(dotwhisker) # provides by_2sd and dwplot
library(broom)
library(dplyr)
data(mtcars)
m1 <- lm(mpg ~ wt + cyl + disp, data = mtcars)
m1_df <- tidy(m1) %>% by_2sd(mtcars) # create data frame of rescaled regression results