Algorithmic constants and parameters for a constrained additive
ordination (CAO), by fitting a reduced-rank vector generalized
additive model (RR-VGAM), are set using this function.
This is the control function for cao.
cao.control(Rank = 1, all.knots = FALSE, criterion = "deviance", Cinit = NULL,
Crow1positive = TRUE, epsilon = 1.0e-05, Etamat.colmax = 10,
GradientFunction = FALSE, iKvector = 0.1, iShape = 0.1,
noRRR = ~ 1, Norrr = NA,
SmallNo = 5.0e-13, Use.Init.Poisson.QO = TRUE,
Bestof = if (length(Cinit)) 1 else 10, maxitl = 10,
imethod = 1, bf.epsilon = 1.0e-7, bf.maxit = 10,
Maxit.optim = 250, optim.maxit = 20, sd.sitescores = 1.0,
sd.Cinit = 0.02, suppress.warnings = TRUE,
trace = TRUE, df1.nl = 2.5, df2.nl = 2.5,
spar1 = 0, spar2 = 0, ...)
A list with the components corresponding to its arguments, after some basic error checking.
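As a minimal sketch of how the returned list might be inspected: the snippet assumes the VGAM package is installed and that each component carries the name of its argument, as the sentence above suggests. It is skipped quietly when VGAM is not available.

```r
# Sketch only: needs the VGAM package; does nothing if it is absent.
if (requireNamespace("VGAM", quietly = TRUE)) {
  # Override two defaults; every other argument keeps its default value.
  ctrl <- VGAM::cao.control(Bestof = 20, trace = FALSE)
  print(is.list(ctrl))   # the control object is a plain list
  print(ctrl$Bestof)     # assumed component name, taken from the argument name
}
```

In practice these arguments are usually passed straight to cao, as the example at the end of this page does with Bestof and Crow1positive.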
The numerical rank \(R\) of the model, i.e., the number
of latent variables. Currently only Rank = 1
is implemented.
Logical indicating if all distinct points of the smoothing
variables are to be used as knots. Assigning the value FALSE
means fewer knots are chosen when the number
of distinct points is large, meaning less computational
expense. See vgam.control for details.
Convergence criterion. Currently, only one is supported: the deviance is minimized.
Optional initial C matrix which may speed up convergence.
Logical vector of length Rank (recycled if
necessary): are the elements of the first row of
C positive? For example, if Rank is 4,
then specifying Crow1positive = c(FALSE, TRUE)
will force C[1,1] and C[1,3] to be negative,
and C[1,2] and C[1,4] to be positive.
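The recycling rule in this example can be checked with a few lines of base R. This is only an illustration of the sign pattern, not code taken from VGAM; the labels are made up for display.

```r
Rank <- 4
Crow1positive <- c(FALSE, TRUE)

# Recycle the logical vector to length Rank, as the argument description states.
recycled <- rep_len(Crow1positive, Rank)

# Map each entry to the sign imposed on the matching element of row 1 of C.
signs <- ifelse(recycled, "positive", "negative")
names(signs) <- paste0("C[1,", seq_len(Rank), "]")
print(signs)
```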
Positive numeric. Used to test for convergence for GLMs fitted in FORTRAN. Larger values mean a loosening of the convergence criterion.
Positive integer, no smaller than Rank. Controls
the amount of memory used by .Init.Poisson.QO().
It is the maximum number of columns allowed for the
pseudo-response and its weights. In general, the larger
the value, the better the initial value. Used only if
Use.Init.Poisson.QO = TRUE.
Logical. Whether optim's argument gr
is used or not, i.e., to compute gradient values.
Used only if FastAlgorithm is TRUE. Currently,
this argument must be set to FALSE.
See qrrvglm.control.
Formula giving terms that are not to be included
in the reduced-rank regression (or formation of the latent
variables). The default is to omit the intercept term from
the latent variables. Currently, only noRRR = ~ 1
is implemented.
Defunct. Please use noRRR.
Use of Norrr will become an error soon.
Positive numeric between .Machine$double.eps and 0.0001.
Used to avoid under- or over-flow in the
IRLS algorithm.
Logical. If TRUE then the function .Init.Poisson.QO
is used to obtain initial values
for the canonical coefficients C. If FALSE
then random numbers are used instead.
Integer. The best of Bestof models fitted is
returned. This argument helps guard against local solutions
by (hopefully) finding the global solution from many
fits. The argument works only when the function generates
its own initial value for C, i.e., when no initial values for
C are passed in. The default
is only a convenient minimal number and users are urged
to increase this value.
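The idea of keeping the best of several fits can be sketched in base R with random restarts of optim. The objective below is a hypothetical multimodal function standing in for the deviance, and best_of is an illustrative helper, not part of VGAM.

```r
# Hypothetical objective with several local minima (stands in for the deviance).
objective <- function(theta) sin(3 * theta) + 0.1 * theta^2

# Fit from n_starts random starting values and keep the fit with the lowest value.
best_of <- function(n_starts, seed = 1) {
  set.seed(seed)
  fits <- lapply(seq_len(n_starts), function(i) {
    start <- runif(1, min = -5, max = 5)   # random initial value
    optim(start, objective, method = "BFGS")
  })
  values <- vapply(fits, function(f) f$value, numeric(1))
  fits[[which.min(values)]]
}

best1  <- best_of(1)    # a single start may stop at a local minimum
best10 <- best_of(10)   # more starts can only match or improve on it here
print(c(one_start = best1$value, ten_starts = best10$value))
```

Because both calls use the same seed, the first start is shared, so the ten-start result is never worse than the one-start result in this sketch.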
Positive integer. Maximum number of Newton-Raphson/Fisher-scoring/local-scoring iterations allowed.
See qrrvglm.control.
Positive numeric. Tolerance used by the modified vector backfitting algorithm for testing convergence.
Positive integer. Number of backfitting iterations allowed in the compiled code.
Positive integer.
Number of iterations given to the function optim
at each of the optim.maxit iterations.
Positive integer.
Number of times optim is invoked.
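How the two iteration limits might nest can be sketched as follows. This is an assumption-laden illustration in base R: the quadratic objective, the restart from the previous solution, and the stopping rule are all hypothetical and are not taken from the VGAM source.

```r
# Hypothetical smooth objective with its minimum at c(1, 2).
objective <- function(theta) sum((theta - c(1, 2))^2)

Maxit.optim <- 250   # iterations allowed within each call to optim
optim.maxit <- 20    # maximum number of calls to optim

theta <- c(0, 0)
for (k in seq_len(optim.maxit)) {
  fit <- optim(theta, objective, method = "BFGS",
               control = list(maxit = Maxit.optim))
  theta <- fit$par                    # restart from the latest solution
  if (fit$convergence == 0) break     # assumed stopping rule: optim reports success
}
print(theta)
```

Under this reading, the total iteration budget is at most optim.maxit times Maxit.optim.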
Numeric. Standard deviation of the
initial values of the site scores, which are generated from
a normal distribution.
Used when Use.Init.Poisson.QO is FALSE.
Standard deviation of the initial values for the elements
of C.
These are normally distributed with mean zero.
This argument is used only if Use.Init.Poisson.QO = FALSE.
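A random starting matrix of this kind could be generated as below. The dimensions are an assumption for illustration: six explanatory variables (as in the example at the end of this page) by Rank columns. The actual generation inside VGAM may differ.

```r
set.seed(123)
p2       <- 6      # assumed number of explanatory variables forming the latent variable
Rank     <- 1
sd.Cinit <- 0.02   # the default value of this argument

# Elements of the initial C drawn from a normal distribution with mean zero.
Cinit <- matrix(rnorm(p2 * Rank, mean = 0, sd = sd.Cinit),
                nrow = p2, ncol = Rank)
print(Cinit)
```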
Logical. Suppress warnings?
Logical indicating if output should be produced for each
iteration. Having the value TRUE is a good idea
for large data sets.
Numeric and non-negative, recycled to length S.
Nonlinear degrees
of freedom for smooths of the first and second latent variables.
A value of 0 means the smooth is linear. Roughly, a value between
1.0 and 2.0 often has the approximate flexibility of a quadratic.
The user should not assign too large a value to this argument, e.g.,
the value 4.0 is probably too high. The argument df1.nl is
ignored if spar1 is assigned a positive value or values. Ditto
for df2.nl.
Numeric and non-negative, recycled to length S.
Smoothing parameters of the
smooths of the first and second latent variables. The larger
the value, the more smooth (less wiggly) the fitted curves.
These arguments are an
alternative to specifying df1.nl and df2.nl.
A value 0 (the default) for spar1 means that
df1.nl is used. Ditto for spar2. The values
are on a scaled version of the latent variables. See Green
and Silverman (1994) for more information.
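The trade-off between a smoothing parameter and effective degrees of freedom can be seen with smooth.spline from the stats package. This is only an analogy on simulated data: the spar scale of smooth.spline is not necessarily the scale used for spar1 and spar2, and smooth.spline uses spar = NULL rather than 0 to fall back on df.

```r
set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)   # simulated noisy curve

wiggly <- smooth.spline(x, y, spar = 0.2)   # small value: flexible fit
smooth <- smooth.spline(x, y, spar = 1.0)   # large value: smoother fit

# A larger smoothing parameter leaves fewer effective degrees of freedom.
print(c(wiggly = wiggly$df, smooth = smooth$df))
```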
Ignored at present.
T. W. Yee
Many of these arguments are identical to qrrvglm.control.
Here, \(R\) is the Rank, \(M\) is the number of additive predictors, and
\(S\) is the number of responses (species). Thus \(M=S\)
for binomial and Poisson responses, and \(M=2S\) for the
negative binomial and 2-parameter gamma distributions.
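As a small worked case of these counts, take five species, the number of response columns in the example at the end of this page. The vector names below are labels for illustration only.

```r
S <- 5   # number of responses (species)

# Number of additive predictors M for each response distribution named above.
M <- c(binomial = S, poisson = S,
       negative.binomial = 2 * S, gamma.2par = 2 * S)
print(M)
```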
Allowing the smooths too much flexibility means the CAO
optimization problem becomes more difficult to solve. This
is because the number of local solutions increases as
the nonlinearity of the smooths increases. In situations
of high nonlinearity, many initial values should be used,
so that Bestof should be assigned a larger value. In
general, there should be a reasonable value of df1.nl
somewhere between 0 and about 3 for most data sets.
Yee, T. W. (2006). Constrained additive ordination. Ecology, 87, 203--213.
Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, London: Chapman & Hall.
cao.
if (FALSE) {
hspider[,1:6] <- scale(hspider[,1:6]) # Standardized environmental vars
set.seed(123)
ap1 <- cao(cbind(Pardlugu, Pardmont, Pardnigr, Pardpull, Zoraspin) ~
WaterCon + BareSand + FallTwig +
CoveMoss + CoveHerb + ReflLux,
family = poissonff, data = hspider,
df1.nl = c(Zoraspin = 2.3, 2.1),
Bestof = 10, Crow1positive = FALSE)
sort(deviance(ap1, history = TRUE)) # A history of all the iterations
Coef(ap1)
par(mfrow = c(2, 3)) # All or most of the curves are unimodal; some are
plot(ap1, lcol = "blue") # quite symmetric. Hence a CQO model should be ok
par(mfrow = c(1, 1), las = 1)
index <- 1:ncol(depvar(ap1)) # lvplot is jagged because only 28 sites
lvplot(ap1, lcol = index, pcol = index, y = TRUE)
trplot(ap1, label = TRUE, col = index)
abline(a = 0, b = 1, lty = 2)
persp(ap1, label = TRUE, col = 1:4)
}