
binomialff(link = "logit", dispersion = 1, mv = FALSE,
onedpar = !mv, parallel = FALSE,
zero = NULL, bred = FALSE, earg.link = FALSE)
Links
and
CommonVGAMffArguments
for more information.dispersion = 0
to have it estimated, or else specify a known
positive value (or values if mv
TRUE
, then the response is interpreted
as $M$ independent binary responses, where $M$ is the number
of columns of the response matrix. In this case, the response matrix
should have zero/one values only.If
mv
, then a separate dispersion
parameter will be computed for each response (column), by default.
Setting onedpar = TRUE
will pool them so that there is only one
dispersion parameter to be estimmv
is TRUE
. This
argument allows for the parallelism assumption whereby the regression
coefficients for a variable is constrained to be equal over the $M$
linear/additive predictors.
If
CommonVGAMffArguments
.CommonVGAMffArguments
.
Setting bred = TRUE
should work for
multiple responses (mv = TRUE
) and
all
See the above note regarding bred
.
The maximum likelihood estimate will not exist if the data is
completely separable or quasi-completely separable.
See Chapter 10 of Altman et al. (2004) for more details,
and sepcheck = TRUE
, say, argument to detect this
problem and give an appropriate warning.
binomial
,
however there are some differences.If the dispersion parameter is unknown, then the resulting estimate is not fully a maximum likelihood estimate (see pp.124--8 of McCullagh and Nelder, 1989).
A dispersion parameter that is less/greater than unity corresponds to under-/over-dispersion relative to the binomial model. Over-dispersion is more common in practice.
Setting mv = TRUE
is necessary when fitting a Quadratic RR-VGLM
(see cqo
) because the response is a matrix of $M$
columns (e.g., one column per species). Then there will be $M$
dispersion parameters (one per column of the response matrix).
When used with cqo
and cao
, it may be
preferable to use the cloglog
link.
Altman, M. and Gill, J. and McDonald, M. P. (2004) Numerical Issues in Statistical Computing for the Social Scientist, Hoboken, NJ, USA: Wiley-Interscience.
Ridout, M. S. (1990) Non-convergence of Fisher's method of scoring---a simple example. GLIM Newsletter, 20(6).
quasibinomialff
,
Links
,
rrvglm
,
cqo
,
cao
,
betabinomial
,
posbinomial
,
zibinomial
,
double.expbinomial
,
matched.binomial
,
seq2binomial
,
amlbinomial
,
simplex
,
binomial
,
simulate.vlm
,
quasibinomialff()
quasibinomialff(link = "probit")
shunua <- hunua[sort.list(with(hunua, altitude)), ] # Sort by altitude
fit <- vglm(agaaus ~ poly(altitude, 2), binomialff(link = cloglog), shunua)
plot(agaaus ~ jitter(altitude), shunua, col = "blue", ylab = "P(Agaaus = 1)",
main = "Presence/absence of Agathis australis", las = 1)
with(shunua, lines(altitude, fitted(fit), col = "orange", lwd = 2))
# Fit two species simultaneously
fit2 <- vgam(cbind(agaaus, kniexc) ~ s(altitude), binomialff(mv = TRUE), shunua)
with(shunua, matplot(altitude, fitted(fit2), type = "l",
main = "Two species response curves", las = 1))
# Shows that Fisher scoring can sometime fail. See Ridout (1990).
ridout <- data.frame(v = c(1000, 100, 10), r = c(4, 3, 3), n = c(5, 5, 5))
(ridout <- transform(ridout, logv = log(v)))
# The iterations oscillates between two local solutions:
glm.fail <- glm(r / n ~ offset(logv) + 1, weight = n,
binomial(link = 'cloglog'), ridout, trace = TRUE)
coef(glm.fail)
# vglm()'s half-stepping ensures the MLE of -5.4007 is obtained:
vglm.ok <- vglm(cbind(r, n-r) ~ offset(logv) + 1,
binomialff(link = cloglog), ridout, trace = TRUE)
coef(vglm.ok)
# Separable data
set.seed(123)
threshold <- 0
bdata <- data.frame(x2 = sort(rnorm(nn <- 100)))
bdata <- transform(bdata, y1 = ifelse(x2 < threshold, 0, 1))
fit <- vglm(y1 ~ x2, binomialff(bred = TRUE),
data = bdata, criter = "coef", trace = TRUE)
coef(fit, matrix = TRUE) # Finite!!
summary(fit)
plot(depvar(fit) ~ x2, data = bdata, col = "blue", las = 1)
lines(fitted(fit) ~ x2, data = bdata, col = "orange")
abline(v = threshold, col = "gray", lty = "dashed")
Run the code above in your browser using DataLab