j_summ() prints output for a regression model in a fashion similar to summary(), but formatted differently with more options.
j_summ(model, ...)
# S3 method for lm
j_summ(model, standardize = FALSE, vifs = FALSE,
robust = FALSE, robust.type = "HC3", digits = 3, model.info = TRUE,
model.fit = TRUE, model.check = FALSE, n.sd = 1, center = FALSE,
standardize.response = FALSE, ...)
# S3 method for glm
j_summ(model, standardize = FALSE, vifs = FALSE, digits = 3,
model.info = TRUE, model.fit = TRUE, n.sd = 1, center = FALSE,
standardize.response = FALSE, ...)
# S3 method for svyglm
j_summ(model, standardize = FALSE, vifs = FALSE,
digits = 3, model.info = TRUE, model.fit = TRUE, model.check = FALSE,
n.sd = 1, center = FALSE, standardize.response = FALSE, ...)
# S3 method for merMod
j_summ(model, standardize = FALSE, digits = 3,
model.info = TRUE, model.fit = TRUE, n.sd = 1, center = FALSE,
standardize.response = FALSE, ...)
...
This just captures extra arguments that may only work for other types of models.
standardize
If TRUE, adds a column to output with standardized regression coefficients. Default is FALSE.
vifs
If TRUE, adds a column to output with variance inflation factors (VIF). Default is FALSE.
robust
If TRUE, reports heteroskedasticity-robust standard errors instead of conventional SEs. These are also known as Huber-White standard errors. Default is FALSE. This requires the sandwich and lmtest packages to compute the standard errors.
robust.type
Only used if robust = TRUE. Specifies the type of robust standard errors to be used by sandwich. By default, set to "HC3". See details for more on options.
digits
An integer specifying the number of digits past the decimal to report in the output. Default is 3.
model.info
Toggles printing of basic information on sample size, name of DV, and number of predictors.
model.fit
Toggles printing of R-squared, Adjusted R-squared, Pseudo-R-squared, and AIC (when applicable).
model.check
Toggles whether to perform a Breusch-Pagan test for heteroskedasticity and print the number of high-leverage observations. See details for more info.
n.sd
If standardize = TRUE, how many standard deviations should predictors be divided by? Default is 1, though some suggest 2.
center
If you want coefficients for mean-centered variables but don't want to standardize, set this to TRUE.
standardize.response
Should standardization apply to the response variable? Default is FALSE.
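As a rough illustration of the variance inflation factors that vifs = TRUE adds to the table, the sketch below computes them by hand in base R. It is not the package's own code: it uses the textbook definition, and the name vif_manual is made up for this example.

```r
# Sketch: variance inflation factors computed by hand for a fitted lm.
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = as.data.frame(state.x77))

# Predictor columns of the design matrix (drop the intercept).
X <- model.matrix(fit)[, -1, drop = FALSE]

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all of the remaining predictors.
vif_manual <- sapply(colnames(X), function(j) {
  others <- X[, colnames(X) != j, drop = FALSE]
  r2 <- summary(lm(X[, j] ~ others))$r.squared
  1 / (1 - r2)
})

print(round(vif_manual, 2))
```

Values close to 1 indicate a predictor that is nearly uncorrelated with the others; larger values indicate more collinearity.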
If saved, users can access most of the items that are returned in the output (and without rounding).
The outputted table of variables and coefficients
The model for which statistics are displayed. This would be most useful in cases in which standardize = TRUE.
Much other information can be accessed as attributes.
By default, this function will print the following items to the console:
The sample size
The name of the outcome variable
The number of predictors used
The (Pseudo-)R-squared value (plus adjusted R-squared if OLS regression).
A table with regression coefficients, standard errors, t-values, and p-values.
There are several options available for robust.type. The heavy lifting is done by vcovHC, where those are better described. Put simply, you may choose from "HC0" to "HC5". Based on the recommendation of the developers of sandwich, the default is set to "HC3". Stata's default is "HC1", so that choice may be better if the goal is to replicate Stata's output. Any option that is understood by vcovHC will be accepted.
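To make the difference between a few of these options concrete, here is a minimal base-R sketch of the HC0, HC1, and HC3 standard errors using their standard textbook formulas. It does not call sandwich; the helper sandwich_se is a hypothetical name used only here. With sandwich installed, vcovHC(fit, type = "HC3") should give the corresponding variances.

```r
# Sketch: heteroskedasticity-consistent standard errors by hand.
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = as.data.frame(state.x77))

X <- model.matrix(fit)      # design matrix, including the intercept
e <- residuals(fit)         # OLS residuals
h <- hatvalues(fit)         # leverage of each observation
n <- nrow(X)
k <- ncol(X)
bread <- solve(crossprod(X))  # (X'X)^-1

# Sandwich form: (X'X)^-1 X' diag(omega) X (X'X)^-1
sandwich_se <- function(omega) {
  meat <- crossprod(X * sqrt(omega))  # equals X' diag(omega) X
  sqrt(diag(bread %*% meat %*% bread))
}

se_hc0 <- sandwich_se(e^2)                # no small-sample correction
se_hc1 <- sandwich_se(e^2 * n / (n - k))  # degrees-of-freedom correction
se_hc3 <- sandwich_se(e^2 / (1 - h)^2)    # leverage-based correction

print(cbind(conventional = sqrt(diag(vcov(fit))),
            HC0 = se_hc0, HC1 = se_hc1, HC3 = se_hc3))
```

HC1 only rescales HC0 by a constant, whereas HC3 inflates the contribution of high-leverage observations, which is why the two can differ noticeably in small samples.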
The standardize and center options are performed via refitting the model with scale_lm and center_lm, respectively. Each of those in turn uses gscale for the mean-centering and scaling. These functions can handle svyglm objects correctly by calling svymean and svyvar to compute means and standard deviations. Weights are not altered. The fact that the model is refit means the runtime will be similar to the original time it took to fit the model.
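The refitting idea can be sketched in base R for an ordinary lm. This is only an illustration of the approach described above and does not call scale_lm, center_lm, or gscale. The variable n_sd stands in for the n.sd argument, and the unweighted mean() and sd() used here would not be appropriate for survey designs.

```r
# Sketch: standardize predictors by hand, then refit the same model.
dat <- as.data.frame(state.x77)
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = dat)

preds <- c("Frost", "Illiteracy", "Murder")
n_sd <- 1  # number of standard deviations to divide by (stand-in for n.sd)

# Mean-center each predictor and divide by n_sd standard deviations.
dat_std <- dat
dat_std[preds] <- lapply(dat[preds], function(x) (x - mean(x)) / (n_sd * sd(x)))

# Refit with the same formula on the transformed data.
fit_std <- lm(formula(fit), data = dat_std)

print(cbind(original = coef(fit), standardized = coef(fit_std)))
```

Because the response is left on its original scale, each standardized slope equals the original slope multiplied by n_sd times the predictor's standard deviation, and the intercept becomes the mean of the response.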
There are two pieces of information given for model.check, provided that the model is an lm object. First, a Breusch-Pagan test is performed with ncvTest, which requires the car package. This is a hypothesis test for which the alternative hypothesis is heteroskedastic errors. The test becomes much more likely to be statistically significant as the sample size increases; however, the homoskedasticity assumption becomes less important to inference as sample size increases (Lumley, Diehr, Emerson, & Chen, 2002). Take the result of the test as a cue to examine graphical diagnostics rather than as a definitive decision. Note that the use of robust standard errors can account for heteroskedasticity, though some oppose this approach (see King & Roberts, 2015).
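For readers who want to see what such a test involves, the following base-R sketch computes a Breusch-Pagan score statistic by regressing scaled squared residuals on the fitted values. It is an assumption that this matches the default behavior of ncvTest; it is meant as an illustration of the idea, not a replacement for the car implementation. The names bp_stat and bp_p are made up for this example.

```r
# Sketch: Breusch-Pagan score test with variance modeled on fitted values.
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = as.data.frame(state.x77))

e2 <- residuals(fit)^2
u <- e2 / mean(e2)  # squared residuals scaled by the ML error-variance estimate

# Auxiliary regression of the scaled squared residuals on the fitted values.
aux <- lm(u ~ fitted(fit))

# Score statistic: half the explained sum of squares, chi-squared with 1 df.
ess <- sum((fitted(aux) - mean(u))^2)
bp_stat <- ess / 2
bp_p <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

print(c(statistic = bp_stat, p.value = bp_p))
```

A small p-value points toward non-constant error variance, which is a prompt to inspect a residuals-versus-fitted plot.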
The second piece of information provided by setting model.check to TRUE is the number of high leverage observations. There are no hard and fast rules for determining high leverage either, but in this case it is based on Cook's Distance. All Cook's Distance values greater than (4/N) are included in the count. Again, this is not a recommendation to locate and remove such observations, but rather to look more closely with graphical and other methods.
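The count described above can be reproduced in base R along these lines. This is a sketch of the stated 4/N rule rather than the package's internal code, and the names cutoff and n_high are illustrative.

```r
# Sketch: count observations whose Cook's Distance exceeds 4/N.
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = as.data.frame(state.x77))

cooks <- cooks.distance(fit)
cutoff <- 4 / nobs(fit)  # N is the number of observations used in the fit

n_high <- sum(cooks > cutoff)
print(n_high)

# Listing the flagged observations makes it easier to inspect them further.
print(names(cooks)[cooks > cutoff])
```

Plotting the distances, for example with plot(fit, which = 4), is a natural follow-up to the count.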
King, G., & Roberts, M. E. (2015). How robust standard errors expose methodological problems they do not fix, and what to do about it. Political Analysis, 23(2), 159–179. https://doi.org/10.1093/pan/mpu015
Lumley, T., Diehr, P., Emerson, S., & Chen, L. (2002). The Importance of the Normality Assumption in Large Public Health Data Sets. Annual Review of Public Health, 23, 151–169. https://doi.org/10.1146/annurev.publhealth.23.100901.140546
scale_lm can simply perform the standardization if preferred.
gscale does the heavy lifting for mean-centering and scaling behind the scenes.
# NOT RUN {
# Create lm object
fit <- lm(Income ~ Frost + Illiteracy + Murder, data = as.data.frame(state.x77))
# Print the output with standardized coefficients and 2 digits past the decimal
j_summ(fit, standardize = TRUE, digits = 2)
# With svyglm
library(survey)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw, data = apistrat,
                    fpc = ~fpc)
regmodel <- svyglm(api00 ~ ell * meals, design = dstrat)
j_summ(regmodel)
# }