summaryvglm: Summarizing Vector Generalized Linear Model Fits

Description

These functions are all methods for class vglm or summary.vglm objects.

Usage

summaryvglm(object, correlation = FALSE, dispersion = NULL,
     digits = NULL, presid = FALSE,
     HDEtest = TRUE, hde.NA = TRUE, threshold.hde = 0.001,
     signif.stars = getOption("show.signif.stars"),
     nopredictors = FALSE,
     lrt0.arg = FALSE, score0.arg = FALSE, wald0.arg = FALSE,
     values0 = 0, subset = NULL, omit1s = TRUE,
     wsdm.arg = FALSE, hdiff = 0.005,
     retry = TRUE, mux.hdiff = 1, eps.wsdm = 0.15,
     Mux.div = 3, doffset.wsdm = NULL, ...)
# S3 method for summary.vglm
show(x, digits = max(3L, getOption("digits") - 3L),
     quote = TRUE, prefix = "", presid = length(x@pearson.resid) > 0,
     HDEtest = TRUE, hde.NA = TRUE, threshold.hde = 0.001,
     signif.stars = NULL, nopredictors = NULL,
     top.half.only = FALSE, ...)

Value

summaryvglm returns an object of class "summary.vglm"; see summary.vglm-class.

Arguments

object

an object of class "vglm", usually, a result of a call to vglm.

x

an object of class "summary.vglm", usually, a result of a call to summaryvglm().

dispersion

used mainly for GLMs. See summary.glm. This argument should not be used because VGAM now steers away from quasi-likelihood models.

correlation

logical; if TRUE, the correlation matrix of the estimated parameters is returned and printed.

digits

the number of significant digits to use when printing.

signif.stars: logical; if TRUE, ‘significance stars’ are printed for each coefficient.

presid: logical; if TRUE, some summary statistics of the Pearson residuals are printed.
HDEtest: logical; if TRUE (the default) then a test for the Hauck-Donner effect (HDE) is performed, else all arguments related to the HDE are ignored.
hde.NA: logical; if the test for the Hauck-Donner effect (performed for each coefficient) is affirmative, should the corresponding Wald test p-value be replaced by an NA? The default is to do so. Setting hde.NA = FALSE will print the p-value even though it will be biased upwards. Also see argument threshold.hde.
threshold.hde: numeric; used if hde.NA = TRUE and the HDE is present for some coefficients. Only p-values greater than this argument will be replaced by an NA, the reason being that small p-values will already be statistically significant. Hence setting threshold.hde = 0 will print out an NA whenever the HDE is present.
quote: Fed into print().
nopredictors: logical; if TRUE the names of the linear predictors are not printed out. The default is that they are.
lrt0.arg, score0.arg, wald0.arg: logical; if lrt0.arg = TRUE then the other arguments are passed into lrt.stat.vlm and the equivalent of the so-called Wald table is output. Similarly, if score0.arg = TRUE then the other arguments are passed into score.stat.vlm and the equivalent of the so-called Wald table is output. Similarly, if wald0.arg = TRUE then the other arguments are passed into wald.stat.vlm and the Wald table corresponding to that is output. See details below. Setting any of these results in further IRLS iterations being performed, and therefore may be computationally expensive.
values0, subset, omit1s: These arguments are used if any of the lrt0.arg, score0.arg, wald0.arg arguments are used. They are passed into the appropriate function, such as wald.stat.vlm.
top.half.only: logical; if TRUE then only print out the top half of the usual output. Used for P-VGAMs.
prefix: Not used.
wsdm.arg: logical; compute the WSDM statistics? If so, wsdm is called and they are printed as a new fifth column. Also printed is the max-WSDM statistic at the bottom. See hdiff about choosing a suitable \(h\). Note that the arguments supplied here are a subset of those of wsdm, hence a more detailed WSDM analysis should be conducted by calling wsdm directly as well.
hdiff: numeric; fed into wsdm. An important argument if wsdm.arg = TRUE. If it is too small or large then the max-WSDM statistic will be described as "inaccurate" in which case trying another value is advised.
retry: logical; fed into wsdm. If TRUE then the computation will take three times longer in order to confirm the reasonable accuracy of the WSDM statistics.
mux.hdiff: fed into wsdm.
eps.wsdm, Mux.div: fed into wsdm.
doffset.wsdm: numeric; fed into wsdm. The default means the vector is searched for on object (such as logistic regression). If nothing is found, then a vector of 1s is used.
...: Not used.

Author

T. W. Yee.

Warning

Currently the SE column is deleted when lrt0.arg = TRUE because SEs are not so meaningful with the LRT. In the future an SE column may be inserted (with NA values) so that it has 4-column output like the other tests. In the meantime, the columns of this matrix should be accessed by name and not by number.

Details

Originally, summaryvglm() was written to be very similar to summary.glm; however, there are now quite a few more options available. By default, show.summary.vglm() tries to be smart about formatting the coefficients, standard errors, etc. and additionally gives ‘significance stars’ if signif.stars is TRUE. The coefficients component of the result gives the estimated coefficients and their estimated standard errors, together with their ratio. This third column is labelled z value regardless of whether the dispersion is estimated or known (or fixed by the family). A fourth column gives the two-tailed p-value corresponding to the z ratio based on a Normal reference distribution. In general, the t distribution is not used, but the normal distribution is.

Correlations are printed to two decimal places (or symbolically): to see the actual correlations print summary(object)@correlation directly.

The Hauck-Donner effect (HDE) is tested for almost all models; see hdeff.vglm for details. Arguments hde.NA and threshold.hde here are meant to give some control of the output if this aberration of the Wald statistic occurs (so that the p-value is biased upwards). If the HDE is present then using lrt.stat.vlm to get a more accurate p-value is a good alternative as p-values based on the likelihood ratio test (LRT) tend to be more accurate than Wald tests and do not suffer from the HDE. Alternatively, if the HDE is present then using wald0.arg = TRUE will compute Wald statistics that are HDE-free; see wald.stat.

The arguments lrt0.arg and score0.arg enable the so-called Wald table to be replaced by the equivalent LRT and Rao score test table; see lrt.stat.vlm, score.stat. Further IRLS iterations are performed for both of these, hence the computational cost might be significant.

It is possible for programmers to write a methods function to print out extra quantities when summary(vglmObject) is called. The generic function is summaryvglmS4VGAM(), and one can use the S4 function setMethod to compute the quantities needed. Also needed is the generic function showsummaryvglmS4VGAM() to actually print the quantities out.

Examples

## For examples see example(glm)
library(VGAM)  # Provides vglm(), acat and the pneumo data set
pneumo <- transform(pneumo, let = log(exposure.time))
(afit <- vglm(cbind(normal, mild, severe) ~ let, acat, pneumo))
coef(afit, matrix = TRUE)
summary(afit)  # Might suffer from the HDE?
coef(summary(afit))
summary(afit, lrt0 = TRUE, score0 = TRUE, wald0 = TRUE)
summary(afit, wsdm = TRUE, hdiff = 0.1)
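
## A hedged sketch, continuing from afit above, of the advice in the Warning
## to access the coefficient table by column name rather than by number.
## The labels "Estimate" and "Pr(>|z|)" are assumed to be those of the
## default Wald table; check colnames() first if in doubt.
ctab <- coef(summary(afit))
colnames(ctab)        # Inspect the column labels before relying on them
ctab[, "Estimate"]    # Estimated coefficients, selected by name
ctab[, "Pr(>|z|)"]    # Two-tailed Wald p-values, selected by name
## If the HDE is suspected, the functions referred to in Details can also be
## called directly. Default arguments are assumed to be adequate here, and
## lrt.stat() is assumed to dispatch to lrt.stat.vlm for a vglm fit.
hdeff(afit)           # Test each coefficient for the Hauck-Donner effect
lrt.stat(afit)        # Likelihood ratio test statistics as an alternative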