alr: Alternating Logistic Regression for multivariate binary data

Usage

alr(formula = formula(data), id = id, weights = NULL, data
                 = sys.parent(), subset, na.action, contrasts = NULL, z
                 = 0, zmast = 0, zid = 0, zlocs = 0, binit = NULL,
                 ainit, bweight = "full", depmodel = "general",
                 subclust = 0, clnames = NULL, tol = 0.001, maxiter =
                 25, silent = TRUE)

Arguments

formula

typical model formula instance for binary logistic regression

id

cluster discriminator: N-vector, where N is the total number of observations

weights

vector of observation-specific weights

data

data frame for bindings of variables in formula

subset

typical subset expression

na.action

missing data handler

contrasts

optional list as in model.matrix.default contrasts.arg

z

matrix of predictors for the pairwise log odds ratio regression. this may be omitted for certain choices of "depmodel"; see below. if used, the matrix takes one of two forms. it may specify the dependency design directly for each cluster, in which case it has dimension (sum_i(n_i*(n_i-1))) x q, where n_i is the size of cluster i. alternatively, if the data to be analyzed are replicated (possibly subject to missingness) and n is the size of a complete cluster, z has dimension (n*(n-1)) x q. see zmast and zlocs below.

zmast

if 1, then z is a "master" design for the pairwise log odds ratio regression. each cluster is a replicate of this master design, but clusters may have missing elements, and thus missing pairs. which elements are present in each cluster is determined from zlocs, below. if 0, then z has dimension (sum_i(n_i*(n_i-1))) x q.
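As an illustration of a master design, the following sketch builds an (n*(n-1)) x q matrix for n = 4 and q = 2 in base R. The row ordering (ordered pairs (j, k) with j != k, k varying fastest) is an assumption inferred from the ZMAST matrix in the Examples section; it is not stated in this documentation. The column names are hypothetical.

```r
# Sketch only: the row ordering is an assumption, not documented behaviour.
n <- 4  # size of a complete cluster

# All ordered pairs (j, k) with j != k; k varies fastest, j slowest.
pairs <- expand.grid(k = seq_len(n), j = seq_len(n))
pairs <- pairs[pairs$j != pairs$k, c("j", "k")]

# q = 2 columns: an intercept, and an indicator that neither member
# of the pair is element 1 of the cluster.
ZMAST <- cbind(intercept = 1,
               no_first  = as.numeric(pairs$j != 1 & pairs$k != 1))

dim(ZMAST)  # n*(n-1) rows by q columns
```

Under this ordering the second column reproduces the 0/1 pattern used for ZMAST in the Examples section.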

zid

cluster discriminators for the z matrix, used only if zmast == 0. if used, has dimension (sum_i(n_i *(n_i-1))) x 1.
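The required length of zid can be worked out from the cluster sizes. The sketch below uses a hypothetical id vector and assumes the rows of z are stacked cluster by cluster in the order of the sorted cluster labels; that stacking order is an assumption, not something this page specifies.

```r
# Hypothetical cluster ids: three clusters of sizes 3, 2 and 4.
id <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)

n_i <- table(id)                                # cluster sizes n_i
rows_per_cluster <- as.vector(n_i * (n_i - 1))  # n_i*(n_i-1) rows of z per cluster

# One cluster label per row of z (assumes rows are grouped by cluster).
zid <- rep(as.numeric(names(n_i)), times = rows_per_cluster)

length(zid)  # equals sum_i(n_i*(n_i-1))
```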

zlocs

used only if zmast == 1. this is an N-vector of "locations" in the dependency design. if replication is perfect and complete, and n is the cluster size, and there are C clusters, then this vector should have the value rep(1:n,C).
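A short sketch of how zlocs might be built, first for complete replication and then with one observation missing. The cluster count, cluster size and missingness pattern are hypothetical.

```r
n <- 4  # size of a complete cluster
C <- 3  # number of clusters

# Complete replication: every cluster has locations 1..n.
zlocs_full <- rep(seq_len(n), C)
id_full    <- rep(seq_len(C), each = n)

# Hypothetical missingness: cluster 2 lacks its 3rd element.
keep  <- !(id_full == 2 & zlocs_full == 3)
id    <- id_full[keep]
zlocs <- zlocs_full[keep]

zlocs  # location of each remaining observation in the master design
```

Both id and zlocs remain N-vectors, with N now equal to the number of observations actually present.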

binit

p-vector of initial values for beta -- required; should be obtained from a logistic regression fit assuming independence
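One way to obtain such starting values is an ordinary logistic regression with glm, which ignores the clustering. The data below are simulated and purely hypothetical; only the use of the independence fit for binit comes from the description above.

```r
set.seed(1)

# Hypothetical simulated data, for illustration only.
N <- 200
x <- rnorm(N)
y <- rbinom(N, 1, plogis(-0.5 + x))

# Logistic regression assuming independent observations.
indep_fit <- glm(y ~ x, family = binomial)

# p-vector of starting values (here p = 2: intercept and slope).
binit <- coef(indep_fit)
binit
```

The resulting vector could then be supplied through the binit argument of alr, using the same model formula.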

ainit

q-vector of initial values for alpha -- difficult to recommend starting values here, but rep(.01,q) often works

tol

maximum relative change in parameter tolerated before asserting convergence
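The idea of a relative-change stopping rule can be sketched as follows. The exact criterion used inside alr is not given on this page, so rel_change is a hypothetical helper and the parameter values are made up.

```r
# Hypothetical helper: largest relative change across parameters.
# Assumes no element of 'old' is exactly zero.
rel_change <- function(old, new) max(abs(new - old) / abs(old))

tol <- 0.001                  # default tolerance from the usage above
old <- c(0.50, -1.20)         # made-up estimates at one iteration
new <- c(0.5002, -1.2001)     # made-up estimates at the next iteration

converged <- rel_change(old, new) < tol
converged
```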

bweight

currently takes one of two values: "full" or "independence". if "independence", then the estimating equation for estimating regression parameters beta uses a diagonal weighting matrix, and the estimated beta should be identical to the GLIM estimate. if "full", then the estimating equation for beta uses the estimated covariance (nxn) matrix of the outcomes as weight. the "independence" setting may be useful for very large clusters (n>50?) with sparse outcomes, in which the weighting matrix can tend to singularity.

depmodel

currently takes one of three values: "general", "exchangeable", "1-nested". The "general" setting uses a Z matrix (see above) to specify the dependency model. The "exchangeable" setting uses no Z matrix, but assumes a common pairwise log-odds ratio for all cluster elements. The "1-nested" setting estimates a log-odds ratio regression with two parameters: the first is the log pairwise odds ratio for any two elements in a cluster, and the second is the incremental dependency between members of the same subcluster. Subclusters are identified through the "subclust" vector; see below.

subclust

N-vector discriminating subclusters within clusters. Thus two 8-clusters might have id=c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2), subclust=c(10,10,11,11,11,11,11,11,20,20,20,20,21,21,21,21). The first cluster falls into 2 subclusters, the first of size 2 and the second of size 6; the second cluster has 2 subclusters, each of size 4. The 1-nested model says that the pairwise log odds ratio between two elements of the same cluster is a0+a1 if the two elements are ALSO of the same subcluster, and a0 if they are from different subclusters.
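The 1-nested structure can be made concrete with a small base R sketch that, for two clusters of size 8, tabulates the implied pairwise log odds ratio for every pair. The values of a0 and a1 are hypothetical; alr would estimate them.

```r
id       <- rep(c(1, 2), each = 8)
subclust <- c(10,10,11,11,11,11,11,11, 20,20,20,20,21,21,21,21)

a0 <- 0.3  # hypothetical within-cluster log odds ratio
a1 <- 0.5  # hypothetical extra dependency within a subcluster

# For each cluster, an n x n matrix of implied pairwise log odds ratios.
log_or <- lapply(split(subclust, id), function(s) {
  same <- outer(s, s, "==")  # TRUE where both elements share a subcluster
  m <- a0 + a1 * same
  diag(m) <- NA              # an element is not paired with itself
  m
})

log_or[["1"]][1, 2]  # elements 1 and 2 share subcluster 10: a0 + a1
log_or[["1"]][1, 3]  # elements 1 and 3 are in different subclusters: a0
```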

Value

a structure including estimates of beta, alpha, and the parameter covariance matrix

Examples

data(alrset)
# exchangeable dependence: one common pairwise log odds ratio, no z matrix
a1 <- alr(alr.y ~ alr.x - 1, id=alr.id, depmodel="exchangeable", ainit=0.01)
summary(a1)

# using a master z matrix for a balanced design
# clusters of size 4, so the master design has 4*3 = 12 rows
ZMAST <- rep(1,12)
ZMAST <- cbind(ZMAST,c(0,0,0,0,1,1,0,1,1,0,1,1))
# location of each observation in the master design
ZIND <- rep(1:4,125)
X <- as.vector(alr.x[,2])
# reverse the order of the observations within each cluster
NY <- split(alr.y,alr.id)
NY <- unlist(lapply(NY,function(x)rev(x)))
NX <- split(X,alr.id)
NX <- unlist(lapply(NX,function(x)rev(x)))
Y <- as.vector(NY)
X <- as.vector(NX)
# q = 2 columns in ZMAST, so ainit needs two starting values
mast.out <- alr(Y ~ X, id=alr.id, depmodel="general", z=ZMAST,
                zmast=1, zlocs=ZIND,
                ainit = c(0.01,0.01))
summary(mast.out)
