ordistep: Choose a Model by Permutation Tests in Constrained Ordination

Description

Automatic stepwise model building for constrained ordination methods (cca, rda, dbrda, capscale). The function ordistep is modelled after step and can do forward, backward and stepwise model selection using permutation tests. Function ordiR2step performs forward model choice solely on adjusted \(R^2\) and \(P\)-value.

Usage

ordistep(object, scope, direction = c("both", "backward", "forward"),
   Pin = 0.05, Pout = 0.1, permutations = how(nperm = 199), steps = 50,
   trace = TRUE, ...)
ordiR2step(object, scope, Pin = 0.05, R2scope = TRUE,
   permutations = how(nperm = 499), trace = TRUE, R2permutations = 1000, ...)

Value

Both functions return the selected model with one additional component, anova, which contains a brief summary of the steps taken. You can suppress voluminous output during model building by setting trace = FALSE, and then find the model history in the anova item.
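As a minimal sketch of the returned object, the snippet below fits a model silently and then inspects the anova component. The dune data and the model names follow the Examples section of this page; the object name sel and the seed value are only illustrative.

```r
library(vegan)

data(dune)
data(dune.env)
set.seed(42)  # permutation tests are random; a seed makes the run repeatable

mod0 <- rda(dune ~ 1, dune.env)  # intercept-only model (lower end of the search)
mod1 <- rda(dune ~ ., dune.env)  # all explanatory variables (upper scope)

# trace = FALSE suppresses the step-by-step printout
sel <- ordistep(mod0, scope = formula(mod1), trace = FALSE)

sel$anova     # summary table of the steps taken
formula(sel)  # formula of the selected model
```

The selected model is an ordinary ordination object, so it can be passed on to functions such as anova or plot.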

Arguments

object: In ordistep, an ordination object inheriting from cca or rda.
scope: Defines the range of models examined in the stepwise search. This can be a list containing components upper and lower, both formulae. If it is a single item, it is interpreted as the target scope, depending on the direction. If direction is "forward", a single item is interpreted as the upper scope and the formula of the input object as the lower scope. See step for details. In ordiR2step, this defines the upper scope; it can also be an ordination object from which the model is extracted.
direction: The mode of stepwise search: one of "both", "backward", or "forward", with a default of "both". If the scope argument is missing, the default direction in ordistep is "backward". ordiR2step does not have this argument and only works forward.
Pin, Pout: Limits of permutation \(P\)-values for adding a term to the model (Pin) or dropping a term from the model (Pout). A term is added if \(P \le\) Pin, and removed if \(P >\) Pout.
R2scope: Use adjusted \(R^2\) as the stopping criterion: only models with lower adjusted \(R^2\) than scope are accepted.
permutations: a list of control values for the permutations as returned by the function how, or the number of permutations required, or a permutation matrix where each row gives the permuted indices. This is passed to anova.cca: see there for details.
steps: Maximum number of iteration steps of dropping and adding terms.
trace: If positive, information is printed during the model building. Larger values may give more information.
R2permutations: Number of permutations used in the estimation of adjusted \(R^2\) for cca using RsquareAdj.
...: Any additional arguments to add1.cca and drop1.cca.
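The following sketch illustrates two of the argument forms described above: scope given as a list with lower and upper formulae, and permutations given either as a how() object or as a plain number. The object names fwd and bwd, the seed, and the permutation counts are arbitrary choices for illustration.

```r
library(vegan)

data(dune)
data(dune.env)
set.seed(123)

mod0 <- rda(dune ~ 1, dune.env)
mod1 <- rda(dune ~ ., dune.env)

# scope as a list with explicit lower and upper formulae;
# permutations as a control object returned by how()
fwd <- ordistep(mod0,
                scope = list(lower = ~ 1, upper = formula(mod1)),
                direction = "forward",
                permutations = how(nperm = 999),
                trace = FALSE)

# scope missing: ordistep defaults to backward elimination;
# permutations given as a simple number
bwd <- ordistep(mod1, permutations = 499, trace = FALSE)

formula(fwd)
formula(bwd)
```

Forward and backward searches need not arrive at the same model, so comparing the two formulae is a useful sanity check.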

Author

Jari Oksanen

Details

The basic functions for model choice in constrained ordination are add1.cca and drop1.cca. With these functions, ordination models can be chosen with the standard R function step, which bases the term choice on AIC. AIC-like statistics for ordination are provided by functions deviance.cca and extractAIC.cca (with similar functions for rda). Strictly speaking, constrained ordination methods do not have an AIC, and therefore the results of step may not be trusted. Function ordistep provides an alternative using permutation \(P\)-values.
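The building blocks mentioned above can be called directly. The sketch below runs a single add1 and drop1 pass with permutation tests and then an AIC-based search with step for comparison; the object names a1, d1 and aic_mod are illustrative, and the AIC-like values should be read with the caveat given above.

```r
library(vegan)

data(dune)
data(dune.env)
set.seed(1)

mod0 <- rda(dune ~ 1, dune.env)
mod1 <- rda(dune ~ ., dune.env)

# One forward pass: test each candidate term for addition to mod0
a1 <- add1(mod0, scope = formula(mod1), test = "permutation")
a1

# One backward pass: test each term of mod1 for removal
d1 <- drop1(mod1, test = "permutation")
d1

# AIC-based selection with the standard step function (trace = 0 is silent)
aic_mod <- step(mod0, scope = formula(mod1), trace = 0)
formula(aic_mod)
```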

Function ordistep defines the model, the scope of models considered, and the direction of the procedure in the same way as step. The function alternates between drop and add steps and stops when the model is not changed during one step. The - and + signs in the summary table indicate which stage was performed. It is often sensible to have Pout \(>\) Pin in stepwise models to avoid cyclic adds and drops of single terms.
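A hedged sketch of the alternating search: starting from the full model with direction = "both", terms are first considered for removal and then for re-addition. Pin and Pout are written out at their default values only to make the Pout > Pin relationship visible; the object name both and the seed are illustrative.

```r
library(vegan)

data(dune)
data(dune.env)
set.seed(7)

mod1 <- rda(dune ~ ., dune.env)  # start from the full model

# Alternate drop and add steps; Pout > Pin guards against cyclic
# adding and dropping of the same term
both <- ordistep(mod1, scope = formula(mod1), direction = "both",
                 Pin = 0.05, Pout = 0.1, trace = FALSE)

both$anova     # steps taken, if any, marked with - (drop) or + (add)
formula(both)  # the model left after the search
```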

Function ordiR2step builds the model forward so that it maximizes adjusted \(R^2\) (function RsquareAdj) at every step, and stops when the adjusted \(R^2\) starts to decrease, or the adjusted \(R^2\) of the scope is exceeded, or the selected permutation \(P\)-value is exceeded (Blanchet et al. 2008). The second criterion is ignored with option R2scope = FALSE, and the third criterion can be ignored by setting Pin = 1 (or higher). The function cannot be used if adjusted \(R^2\) cannot be calculated. If the number of predictors is higher than the number of observations, adjusted \(R^2\) is also unavailable. Such models can be analysed with R2scope = FALSE, but the variable selection will stop if models become overfitted and adjusted \(R^2\) cannot be calculated, and the adjusted \(R^2\) will then be reported as zero. The \(R^2\) of cca is based on simulations (see RsquareAdj), so different runs of ordiR2step can give different results.
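The stopping criteria can be explored as in the sketch below, which first reports the adjusted \(R^2\) of the full scope model and then runs ordiR2step with the default criteria and with the second and third criteria switched off. The object names r2_full, r2mod and r2free are illustrative.

```r
library(vegan)

data(dune)
data(dune.env)
set.seed(99)

mod0 <- rda(dune ~ 1, dune.env)
mod1 <- rda(dune ~ ., dune.env)

# Adjusted R2 of the scope model: the ceiling used when R2scope = TRUE
r2_full <- RsquareAdj(mod1)$adj.r.squared
r2_full

# Default behaviour: all three stopping criteria are active
r2mod <- ordiR2step(mod0, scope = mod1, trace = FALSE)
r2mod$anova

# Ignore the scope ceiling (R2scope = FALSE) and the P-value limit (Pin = 1);
# selection then stops only when adjusted R2 starts to decrease
r2free <- ordiR2step(mod0, scope = mod1, R2scope = FALSE, Pin = 1,
                     trace = FALSE)
formula(r2free)
```

Relaxing the criteria can only keep the search going longer, so r2free may contain more terms than r2mod.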

Functions ordistep (based on \(P\) values) and ordiR2step (based on adjusted \(R^2\) and hence on eigenvalues) can select variables in different order.

References

Blanchet, F. G., Legendre, P. & Borcard, D. (2008) Forward selection of explanatory variables. Ecology 89, 2623–2632.

Examples

## See add1.cca for another example

### Dune data
data(dune)
data(dune.env)
mod0 <- rda(dune ~ 1, dune.env)  # Model with intercept only
mod1 <- rda(dune ~ ., dune.env)  # Model with all explanatory variables

## With scope present, the default direction is "both"
mod <- ordistep(mod0, scope = formula(mod1))
mod
## summary table of steps
mod$anova

## Example of ordistep, forward
ordistep(mod0, scope = formula(mod1), direction="forward")

## Example of ordiR2step (always forward)
## stops because R2 of 'mod1' exceeded
ordiR2step(mod0, mod1)
