Learn R Programming

VGAM (version 0.9-2)

binom2.or: Bivariate Binary Regression with an Odds Ratio (Family Function)

Description

Fits a Palmgren (bivariate odds-ratio model, or bivariate logistic regression) model to two binary responses. Actually, a bivariate logistic/probit/cloglog/cauchit model can be fitted. The odds ratio is used as a measure of dependency.

Usage

binom2.or(lmu = "logit", lmu1 = lmu, lmu2 = lmu, loratio = "loge",
          imu1 = NULL, imu2 = NULL, ioratio = NULL, zero = 3,
          exchangeable = FALSE, tol = 0.001, morerobust = FALSE)

Arguments

lmu
Link function applied to the two marginal probabilities. See Links for more choices. See the note below.
lmu1, lmu2
Link function applied to the first and second of the two marginal probabilities.
loratio
Link function applied to the odds ratio. See Links for more choices.
imu1, imu2, ioratio
Optional initial values for the marginal probabilities and odds ratio. See CommonVGAMffArguments for more details. In general good initial values are often required so use these argument
zero
Which linear/additive predictor is modelled as an intercept only? A NULL means none.
exchangeable
Logical. If TRUE, the two marginal probabilities are constrained to be equal.
tol
Tolerance for testing independence. Should be some small positive numerical value.
morerobust
Logical. If TRUE then some measures are taken to compute the derivatives and working weights more robustly, i.e., in an attempt to avoid numerical problems. Currently this feature is not debugged if set TRUE.

Value

  • An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

    When fitted, the fitted.values slot of the object contains the four joint probabilities, labelled as $(Y_1,Y_2)$ = (0,0), (0,1), (1,0), (1,1), respectively. These estimated probabilities should be extracted with the fitted generic function.

Details

Also known informally as the Palmgren model, the bivariate logistic model is a full-likelihood based model defined as two logistic regressions plus log(oratio) = eta3 where eta3 is the third linear/additive predictor relating the odds ratio to explanatory variables. Explicitly, the default model is $$logit[P(Y_j=1)] = \eta_j,\ \ \ j=1,2$$ for the marginals, and $$\log[P(Y_{00}=1) P(Y_{11}=1) / (P(Y_{01}=1) P(Y_{10}=1))] = \eta_3,$$ specifies the dependency between the two responses. Here, the responses equal 1 for a success and a 0 for a failure, and the odds ratio is often written $\psi=p_{00}p_{11}/(p_{10}p_{01})$. The model is fitted by maximum likelihood estimation since the full likelihood is specified. The two binary responses are independent if and only if the odds ratio is unity, or equivalently, the log odds ratio is zero. Fisher scoring is implemented.

The default models $\eta_3$ as a single parameter only, i.e., an intercept-only model, but this can be circumvented by setting zero = NULL in order to model the odds ratio as a function of all the explanatory variables. The function binom2.or() can handle other probability link functions such as probit, cloglog and cauchit links as well, so is quite general. In fact, the two marginal probabilities can each have a different link function. A similar model is the bivariate probit model (binom2.rho), which is based on a standard bivariate normal distribution, but the bivariate probit model is less interpretable and flexible.

The exchangeable argument should be used when the error structure is exchangeable, e.g., with eyes or ears data.

References

McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.

le Cessie, S. and van Houwelingen, J. C. (1994) Logistic regression for correlated binary data. Applied Statistics, 43, 95--108.

Palmgren, J. (1989) Regression Models for Bivariate Binary Responses. Technical Report no. 101, Department of Biostatistics, University of Washington, Seattle.

Yee, T. W. and Dirnbock, T. (2009) Models for analysing species' presence/absence data at two time points. Journal of Theoretical Biology, 259(4), 684--694.

Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.

See Also

rbinom2.or, binom2.rho, loglinb2, zipebcom, coalminers, binomialff, logit, probit, cloglog, cauchit.

Examples

Run this code
# Fit the model in Table 6.7 in McCullagh and Nelder (1989)
coalminers <- transform(coalminers, Age = (age - 42) / 5)
fit <- vglm(cbind(nBnW, nBW, BnW, BW) ~ Age, binom2.or(zero = NULL), coalminers)
fitted(fit)
summary(fit)
coef(fit, matrix = TRUE)
c(weights(fit, type = "prior")) * fitted(fit)  # Table 6.8

with(coalminers, matplot(Age, fitted(fit), type = "l", las = 1,
                         xlab = "(age - 42) / 5", lwd = 2))
with(coalminers, matpoints(Age, depvar(fit), col=1:4))
legend(x = -4, y = 0.5, lty = 1:4, col = 1:4, lwd = 2,
       legend = c("1 = (Breathlessness=0, Wheeze=0)",
                  "2 = (Breathlessness=0, Wheeze=1)",
                  "3 = (Breathlessness=1, Wheeze=0)",
                  "4 = (Breathlessness=1, Wheeze=1)"))


# Another model: pet ownership
require(VGAMdata)
# More homogeneous:
petdata <- subset(xs.nz, ethnic == "0" & age < 70 & sex == "M")
petdata <- na.omit(petdata[, c("cat", "dog", "age")])
summary(petdata)
with(petdata, table(cat, dog))  # Can compute the odds ratio

fit <- vgam(cbind((1-cat) * (1-dog), (1-cat) * dog,
                     cat  * (1-dog),    cat  * dog) ~ s(age, df = 5),
            binom2.or(zero =    3), data = petdata, trace = TRUE)
colSums(depvar(fit))
coef(fit, matrix = TRUE)

# Plot the estimated probabilities
ooo <- order(with(petdata, age))
matplot(with(petdata, age)[ooo], fitted(fit)[ooo, ], type = "l",
        xlab = "Age", ylab = "Probability", main = "Pet ownership",
        ylim = c(0, max(fitted(fit))), las = 1, lwd = 1.5)
legend("topleft", col=1:4, lty = 1:4, leg = c("no cat or dog ",
       "dog only", "cat only", "cat and dog"), lwd = 1.5)

Run the code above in your browser using DataLab