Learn R Programming

VGAM (version 0.8-2)

cumulative: Ordinal Regression with Cumulative Probabilities

Description

Fits a cumulative link regression model to a (preferably ordered) factor response.

Usage

cumulative(link = "logit", earg = list(), parallel = FALSE,
           reverse = FALSE, mv = FALSE, intercept.apply = FALSE)

Arguments

link
Link function applied to the $J$ cumulative probabilities. See Links for more choices, e.g., for the cumulative probit/
earg
List. Extra argument for the link function. See earg in Links for general information.
parallel
A logical or formula specifying which terms have equal/unequal coefficients. See below for more information about the parallelism assumption.
reverse
Logical. By default, the cumulative probabilities used are $P(Y\leq 1)$, $P(Y\leq 2)$, ..., $P(Y\leq J)$. If reverse is TRUE then $P(Y\geq 2)$, $P(Y\geq 3)$, ..., $P(Y\geq J+1)$ are used.

This should be set to

mv
Logical. Multivariate response? If TRUE then the input should be a matrix with values $1,2,\dots,L$, where $L=J+1$ is the number of levels. Each column of the matrix is a response, i.e., multivariate response. A suitable matrix can
intercept.apply
Logical. Whether the parallel argument should be applied to the intercept term. This should be set to TRUE for link= golf, polf

Value

  • An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Warning

No check is made to verify that the response is ordinal; see ordered.

Details

In this help file the response $Y$ is assumed to be a factor with ordered values $1,2,\dots,J+1$. Hence $M$ is the number of linear/additive predictors $\eta_j$; for cumulative() one has $M=J$.

This VGAM family function fits the class of cumulative link models to (hopefully) an ordinal response. By default, the non-parallel cumulative logit model is fitted, i.e., $$\eta_j = logit(P[Y \leq j])$$ where $j=1,2,\dots,M$ and the $\eta_j$ are not constrained to be parallel. This is also known as the non-proportional odds model. If the logit link is replaced by a complementary log-log link (cloglog) then this is known as the proportional-hazards model.

In almost all the literature, the constraint matrices associated with this family of models are known. For example, setting parallel = TRUE will make all constraint matrices (except for the intercept) equal to a vector of $M$ 1's. If the constraint matrices are equal, unknown and to be estimated, then this can be achieved by fitting the model as a reduced-rank vector generalized linear model (RR-VGLM; see rrvglm). Currently, reduced-rank vector generalized additive models (RR-VGAMs) have not been implemented here.

References

Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York: Wiley.

Agresti, A. (2010) Analysis of Ordinal Categorical Data, 2nd ed. New York: Wiley.

Dobson, A. J. and Barnett, A. (2008) An Introduction to Generalized Linear Models, 3rd ed. Boca Raton: Chapman & Hall/CRC Press.

McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.

Simonoff, J. S. (2003) Analyzing Categorical Data, New York: Springer-Verlag.

Yee, T. W. (2010) The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1--34. http://www.jstatsoft.org/v32/i10/.

Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481--493.

Further information and examples on categorical data analysis by the VGAM package can be found at http://www.stat.auckland.ac.nz/~yee/VGAM/doc/categorical.pdf.

See Also

propodds, prplot, margeff, acat, cratio, sratio, multinomial, pneumo, Links, logit, probit, cloglog, cauchit, golf, polf, nbolf, logistic1.

Examples

Run this code
# Fit the proportional odds model, p.179, in McCullagh and Nelder (1989)
pneumo = transform(pneumo, let = log(exposure.time))
(fit = vglm(cbind(normal, mild, severe) ~ let,
            cumulative(parallel = TRUE, reverse = TRUE), pneumo))
fit@y   # Sample proportions
weights(fit, type = "prior")   # Number of observations
coef(fit, matrix = TRUE)
constraints(fit)   # Constraint matrices

# Check that the model is linear in let ----------------------
fit2 = vgam(cbind(normal, mild, severe) ~ s(let, df = 2),
            cumulative(reverse = TRUE), pneumo)
plot(fit2, se = TRUE, overlay = TRUE, lcol = 1:2, scol = 1:2)

# Check the proportional odds assumption with a LRT ----------
(fit3 = vglm(cbind(normal, mild, severe) ~ let,
             cumulative(parallel=FALSE, reverse=TRUE), pneumo))
pchisq(2*(logLik(fit3)-logLik(fit)),
       df = length(coef(fit3))-length(coef(fit)), lower.tail = FALSE)

# A factor() version of fit ----------------------------------
# This is in long format (cf. wide format above)
nobs = round(fit@y * c(weights(fit, type = "prior")))
sumnobs = colSums(nobs) # apply(nobs, 2, sum)

pneumo.long = data.frame(symptoms = ordered(rep(rep(colnames(nobs),
                                  nrow(nobs)),
                                  times = c(t(nobs))),
                                  levels = colnames(nobs)),
                         let = rep(rep(with(pneumo, let), each = ncol(nobs)),
                                   times = c(t(nobs))))
with(pneumo.long, table(let, symptoms)) # check it; should be same as pneumo


(fit.long1 = vglm(symptoms ~ let, data = pneumo.long,
             cumulative(parallel=TRUE, reverse=TRUE), trace=TRUE))
coef(fit.long1, matrix=TRUE) # Should be same as coef(fit, matrix=TRUE)
# Could try using mustart if fit.long1 failed to converge.
mymustart = matrix(sumnobs / sum(sumnobs),
                   nrow(pneumo.long), ncol(nobs), byrow = TRUE)
fit.long2 = vglm(symptoms ~ let,
                fam = cumulative(parallel = TRUE, reverse = TRUE),
                mustart = mymustart, data = pneumo.long, trace = TRUE)
coef(fit.long2, matrix = TRUE) # Should be same as coef(fit, matrix = TRUE)

Run the code above in your browser using DataLab