Qvar: Quasi-variances Preprocessing Function

Description

Takes a vglm fit or a variance-covariance matrix, and preprocesses it for rcim and normal1 so that quasi-variances can be computed.

Usage

Qvar(object, factorname = NULL, which.eta = 1,
     coef.indices = NULL, labels = NULL,
     dispersion = NULL, reference.name = "(reference)", estimates = NULL)

Arguments

object

A "vglm" object or a variance-covariance matrix, e.g., vcov(vglm.object). The former is preferred since it contains all the information needed. If a matrix then fa

which.eta

A single integer from the set 1:M. Specifies which linear predictor to use. Let the value of which.eta be called $j$. Then the factor should appear in that linear predictor, hence the $j$th row of the constraint matrix co

factorname

Character. If the vglm object contains more than one factor as explanatory variable then this argument should be the name of the factor of interest. If object is a variance-covariance matr

labels

Character. Optional, for labelling the variance-covariance matrix.

dispersion

Numeric. Optional, passed into vcov() with the same argument name.

reference.name

Character. Label for for the reference level.

coef.indices

Optional numeric vector of length at least 3 specifying the indices of the factor from the variance-covariance matrix.

estimates

an optional vector of estimated coefficients (redundant if object is a model).

Value

A $L$ by $L$ matrix whose $i$-$j$ element is the logarithm of the variance of the $i$th coefficient minus the $j$th coefficient, for all values of $i$ and $j$. The diagonal elements are abitrary and are set to zero.
The matrix has an attribute that corresponds to the prior weight matrix; it is accessed by normal1 and replaces the usual weights argument. of vglm. This weight matrix has ones on the off-diagonals and some small positive number on the diagonals.

Warning

Negative quasi-variances may occur (one of them and only one), though they are rare in practice. If so then numerical problems may occur. See qvcalc() for more information.

Details

Suppose a factor with $L$ levels is an explanatory variable in a regression model. By default, R treats the first level as baseline so that its coefficient is set to zero. It estimates the other $L-1$ coefficients, and with its associated standard errors, this is the conventional output. From the complete variance-covariance matrix one can compute $L$ quasi-variances based on all pairwise difference of the coefficients. They are based on an approximation, and can be treated as uncorrelated. In minimizing the relative (not absolute) errors it is not hard to see that the estimation involves a RCIM (rcim) with an exponential link function (explink).

If object is a model, then at least one of factorname or coef.indices must be non-NULL. The value of coef.indices, if non-NULL, determines which rows and columns of the model's variance-covariance matrix to use. If coef.indices contains a zero, an extra row and column are included at the indicated position, to represent the zero variances and covariances associated with a reference level. If coef.indices is NULL, then factorname should be the name of a factor effect in the model, and is used in order to extract the necessary variance-covariance estimates.

Quasi-variances were first implemented in R with qvcalc. This implementation draws heavily from that.

References

Firth, D. (2003) Overcoming the reference category problem in the presentation of statistical models. Sociological Methodology 33, 1--18.

Firth, D. and de Menezes, R. X. (2004) Quasi-variances. Biometrika 91, 65--80.

Examples

Run this code

# Example 1
data("ships", package = "MASS")

Shipmodel <- vglm(incidents ~ type + year + period,
                  quasipoissonff, offset = log(service),
#                 trace = TRUE, model = TRUE,
                  data = ships, subset = (service > 0))

# Easiest form of input
fit1 <- rcim(Qvar(Shipmodel, "type"), normal1("explink"), maxit = 99)
(quasiVar <- exp(diag(fitted(fit1))) / 2)                 # Version 1
(quasiVar <- diag(predict(fit1)[, c(TRUE, FALSE)]) / 2)   # Version 2
(quasiSE  <- sqrt(quasiVar))

# Another form of input
fit2 <- rcim(Qvar(Shipmodel, coef.ind = c(0,2:5), reference.name = "typeA"),
             normal1("explink"), maxit = 99)
plotqvar(fit2, col = "green", lwd = 3, scol = "blue", slwd = 2, las = 1)

# The variance-covariance matrix is another form of input (not recommended)
fit3 <- rcim(Qvar(cbind(0, rbind(0, vcov(Shipmodel)[2:5, 2:5])),
                  labels = c("typeA", "typeB", "typeC", "typeD", "typeE"),
                  estimates = c(typeA = 0, coef(Shipmodel)[2:5])),
             normal1("explink"), maxit = 99)
(QuasiVar <- exp(diag(fitted(fit3))) / 2)                 # Version 1
(QuasiVar <- diag(predict(fit3)[, c(TRUE, FALSE)]) / 2)   # Version 2
(QuasiSE  <- sqrt(quasiVar))
plotqvar(fit3)


# Example 2: a model with M > 1 linear predictors
require(VGAMdata)
xs.nz.f <- subset(xs.nz, sex == "F")
xs.nz.f <- subset(xs.nz.f, !is.na(babies)  & !is.na(age) & !is.na(ethnic))
xs.nz.f$babies <- as.numeric(as.character(xs.nz.f$babies))
xs.nz.f <- subset(xs.nz.f, babies >=  0)
xs.nz.f <- subset(xs.nz.f, as.numeric(as.character(ethnic)) <=  2)

clist <- list("bs(age, df = 4)" = rbind(1, 0),
              "bs(age, df = 3)" = rbind(0, 1),
              "ethnic" = diag(2),
              "(Intercept)" = diag(2))
fit1 <- vglm(babies ~ bs(age, df = 4) + bs(age, df = 3) + ethnic,
            zipoissonff(zero = NULL), xs.nz.f,
            constraints = clist, trace = TRUE)
Fit1 <- rcim(Qvar(fit1, "ethnic", which.eta = 1),
             normal1("explink", imethod = 1), maxit = 99, trace = TRUE)
Fit2 <- rcim(Qvar(fit1, "ethnic", which.eta = 2),
             normal1("explink", imethod = 1), maxit = 99, trace = TRUE)
par(mfrow = c(1, 2))
plotqvar(Fit1, scol = "blue", pch = 16,
         main = expression(eta[1]),
         slwd = 1.5, las = 1, length.arrows = 0.07)
plotqvar(Fit2, scol = "blue", pch = 16,
         main = expression(eta[2]),
         slwd = 1.5, las = 1, length.arrows = 0.07)

Run the code above in your browser using DataLab