rrvglm: Fitting Reduced-Rank Vector Generalized Linear Models (RR-VGLMs)

Description

Fits a reduced-rank vector generalized linear model (RR-VGLM). RR-VGLMs are VGLMs in which some of the constraint matrices are estimated rather than specified in advance. In this documentation, $M$ is the number of linear predictors.

Usage

rrvglm(formula, family, data = list(), weights = NULL, subset = NULL,
       na.action = na.fail, etastart = NULL, mustart = NULL,
       coefstart = NULL, control = rrvglm.control(...), offset = NULL,
       method = "rrvglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
       contrasts = NULL, constraints = NULL, extra = NULL,
       qr.arg = FALSE, smart = TRUE, ...)

Arguments

Value

An object of class "rrvglm", which has the same slots as a "vglm" object. The only difference is that some of the constraint matrices are estimated rather than known. VGAM nevertheless stores both kinds of model in the same way internally. The slots of "vglm" objects are described in vglm-class.

Details

The central formula is given by $$\eta = B_1^T x_1 + A \nu$$ where $x_1$ is a vector of explanatory variables (usually just a 1 for an intercept), $x_2$ is another vector of explanatory variables, and $\nu = C^T x_2$ is an $R$-vector of latent variables. Here, $\eta$ is the $M$-vector of linear predictors, e.g., the $m$th element is $\eta_m = \log(E[Y_m])$ for the $m$th Poisson response. The matrices $B_1$, $A$ and $C$ contain the regression coefficients and are estimated from the data. For ecologists, the central formula represents a constrained linear ordination (CLO) because it is linear in the latent variables. Consequently, each linear predictor (and hence, under a monotonic link, each mean response) is a monotonically increasing or decreasing function of each latent variable.
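
One way to read the central formula, following the reduced-rank regression idea in Yee and Hastie (2003), is as an ordinary VGLM whose coefficient matrix for $x_2$ is constrained to have rank $R$. Writing $p_2$ for the length of $x_2$ (a symbol introduced here for illustration only), $$\eta = B_1^T x_1 + B_2^T x_2, \qquad B_2 = C A^T,$$ so that $B_2^T x_2 = A C^T x_2 = A \nu$. Here $A$ is $M \times R$ and $C$ is $p_2 \times R$, so the $M p_2$ free entries of an unrestricted $B_2$ are replaced by the $R(M + p_2)$ entries of $A$ and $C$, before any identifiability constraints are imposed. This is a saving when $R$ is small relative to $M$ and $p_2$.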

RR-VGLMs are fitted by iteratively reweighted least squares (IRLS), with an optimizing algorithm (e.g., the alternating algorithm) applied within each IRLS iteration.
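
To give a feel for the alternating idea, the following is a minimal base-R sketch for the simplest special case: a Gaussian response with identity link and no $x_1$ term, where the IRLS weights are constant and only the alternation between $A$ and $C$ remains. It is not the code used by rrvglm.fit; the simulated data and all object names (X2, A.true, C.true, B2.hat, and so on) are assumptions made for illustration.

```r
# Alternating least squares for a rank-R Gaussian regression Y = X2 C A^T + E.
# Illustrative sketch only; this is NOT VGAM's implementation.
set.seed(1)
n <- 200; p <- 4; M <- 3; R <- 1          # hypothetical problem sizes

X2     <- matrix(rnorm(n * p), n, p)      # explanatory variables x_2
C.true <- matrix(c(1, -1, 0.5, 0), p, R)  # hypothetical true C
A.true <- matrix(c(1, 2, -1), M, R)       # hypothetical true A
Y <- X2 %*% C.true %*% t(A.true) + matrix(rnorm(n * M, sd = 0.1), n, M)

C <- matrix(rnorm(p * R), p, R)           # random starting value for C
rss.old <- Inf
for (iter in 1:100) {
  nu <- X2 %*% C                                   # latent variables (n x R)
  # Step 1: with C fixed, regress Y on nu to update A (M x R).
  A <- t(solve(crossprod(nu), crossprod(nu, Y)))
  # Step 2: with A fixed, the least-squares update of C (p x R).
  C <- solve(crossprod(X2), crossprod(X2, Y %*% A)) %*% solve(crossprod(A))
  rss <- sum((Y - X2 %*% C %*% t(A))^2)            # residual sum of squares
  if (abs(rss.old - rss) < 1e-10 * (1 + rss)) break
  rss.old <- rss
}

B2.hat <- C %*% t(A)   # fitted reduced-rank coefficient matrix (p x M)
```

Only the product B2.hat is identified here; A and C individually are determined up to an invertible $R \times R$ transformation, which is why a fitting function needs some form of normalization for them.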

In theory, any VGAM family function that works for vglm and vgam should work for rrvglm too.

rrvglm.fit is the function that actually does the work. It is vglm.fit with some extra code.

References

Yee, T. W. and Hastie, T. J. (2003) Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15--41.

Yee, T. W. (2004) A new technique for maximum-likelihood canonical Gaussian ordination. Ecological Monographs, 74, 685--701.

Anderson, J. A. (1984) Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1--30.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

Examples

data(car.all)
# Keep only cars from four countries
index <- with(car.all, Country == "Germany" | Country == "USA" |
                       Country == "Japan" | Country == "Korea")
scar <- car.all[index, ]  # subset of the car data (standardized below)
fcols <- c(13,14,18:20,22:26,29:31,33,34,36)  # These are factors
scar[, -fcols] <- scale(scar[, -fcols])  # Standardize all numerical vars

# Constraint matrices: Width and Weight have a common effect on all
# three linear predictors; the other terms are unconstrained
ones <- matrix(1, 3, 1)
cms <- list("(Intercept)"=diag(3), Width=ones, Weight=ones,
            Disp.=diag(3), Tank=diag(3), Price=diag(3),
            Frt.Leg.Room=diag(3))
set.seed(111)
# Rank-2 multinomial logit model; terms in Norrr are not reduced-rank
fit <- rrvglm(Country ~ Width + Weight + Disp. + Tank + Price + Frt.Leg.Room,
              multinomial, data = scar, Rank = 2, trace = TRUE,
              constraints=cms, Norrr = ~ 1 + Width + Weight,
              Uncor=TRUE, Corner=FALSE, Bestof=2)
fit@misc$deviance  # A history of the fits
Coef(fit)
biplot(fit, chull=TRUE, scores=TRUE, clty=2, ccol="blue", scol="red",
       Ccol="darkgreen", Clwd=2, Ccex=2,
       main="1=Germany, 2=Japan, 3=Korea, 4=USA")
