Function predict can be used to find site and species scores or estimates of the response data with new data sets. Function calibrate estimates values of constraints with a new data set. Functions fitted and residuals return estimates of the response data.
## S3 method for class 'cca'
fitted(object, model = c("CCA", "CA", "pCCA"), type = c("response", "working"), ...)
## S3 method for class 'capscale'
fitted(object, model = c("CCA", "CA", "pCCA", "Imaginary"), type = c("response", "working"), ...)
## S3 method for class 'cca'
residuals(object, ...)
## S3 method for class 'cca'
predict(object, newdata, type = c("response", "wa", "sp", "lc", "working"), rank = "full", model = c("CCA", "CA"), scaling = "none", hill = FALSE, ...)
## S3 method for class 'rda'
predict(object, newdata, type = c("response", "wa", "sp", "lc", "working"), rank = "full", model = c("CCA", "CA"), scaling = "none", correlation = FALSE, ...)
## S3 method for class 'cca'
calibrate(object, newdata, rank = "full", ...)
## S3 method for class 'cca'
coef(object, ...)
## S3 method for class 'decorana'
predict(object, newdata, type = c("response", "sites", "species"), rank = 4, ...)
model: constrained ("CCA"), unconstrained ("CA") or conditioned partial ("pCCA") results. For the fitted method of capscale this can also be "Imaginary" for imaginary components with negative eigenvalues.
newdata: with type = "lc", and for the constrained component with type = "response" and type = "working", it must be a data frame of constraints. The newdata must have the same number of rows as the original community data for a cca result with type = "response" or type = "working". If the original model had row or column names, then new data must contain rows or columns with the same names (row names for species scores, column names for "wa" scores and constraint names for "lc" scores). In other cases the rows or columns must match directly.
type: "response" scales results so that the same ordination gives the same results, and "working" gives the values used internally, that is, after Chi-square standardization in cca and after scaling and centring in rda. In capscale, "response" gives the dissimilarities, and "working" the scaled scores that produce the dissimilarities as Euclidean distances. Alternative "wa" gives the site scores as weighted averages of the community data, "lc" the site scores as linear combinations of environmental data, and "sp" the species scores. In predict.decorana the alternatives are scores for "sites" or "species".
rank: [...] the "model" or all available four axes in predict.decorana.
correlation, hill: [...] capscale and CCA respectively. See scores.cca for additional details.
Function fitted
gives the approximation of the original data matrix or dissimilarities from the ordination result, either in the scale of the response or as scaled internally by the function. Function residuals gives the approximation of the original data from the unconstrained ordination. With argument type = "response", the fitted.cca and residuals.cca functions both give the same marginal totals as the original data matrix, and fitted and residuals do not add up to the original data. Functions fitted.capscale and residuals.capscale give the dissimilarities with type = "response"; these are not additive, but the "working" scores are additive. All variants of fitted and residuals are defined so that for model mod <- cca(y ~ x), cca(fitted(mod)) is equal to the constrained ordination, and cca(residuals(mod)) is equal to the unconstrained part of the ordination.
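The additivity statements above can be checked numerically. The following is a minimal sketch, assuming the vegan package and its dune data are installed, and using a model without a Condition term so that only constrained and residual components exist. The object names (work_sum, full_working and so on) are illustrative only.

```r
library(vegan)
data(dune)
data(dune.env)

# Constrained model without conditioning variables
mod <- cca(dune ~ A1 + Moisture, data = dune.env)

# "working" values: constrained + residual parts should rebuild the
# internally standardized data. As a reference we take the full-rank
# reconstruction from an unconstrained CA of the same data.
work_sum <- fitted(mod, type = "working") + residuals(mod, type = "working")
full_working <- fitted(cca(dune), model = "CA", type = "working")
additive_working <- all.equal(work_sum, full_working, check.attributes = FALSE)

# "response" values: each part carries the marginal totals of the data,
# so their sum is not expected to equal the original matrix
resp_sum <- fitted(mod) + residuals(mod)
additive_response <- all.equal(resp_sum, as.matrix(dune), check.attributes = FALSE)

# Row totals of the fitted values compared with the original row totals
same_margins <- all.equal(rowSums(fitted(mod)), rowSums(dune),
                          check.attributes = FALSE)

isTRUE(additive_working)
isTRUE(additive_response)
isTRUE(same_margins)
```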
Function predict can find the estimate of the original data matrix or dissimilarities (type = "response") with any rank. With rank = "full" it is identical to fitted. In addition, the function can find the species scores or site scores from the community data matrix for cca or rda.
The function can be used with new data, and it can be used to add new species or site scores to existing ordinations. The function returns (weighted) orthonormal scores by default, and you must specify explicit scaling to add those scores to ordination diagrams. With type = "wa" the function finds the site scores from species scores. In that case, the new data can contain new sites, but species must match in the original and new data. With type = "sp" the function finds species scores from site constraints (linear combination scores). In that case the new data can contain new species, but sites must match in the original and new data. With type = "lc" the function finds the linear combination scores for sites from environmental data. In that case the new data frame must contain all constraining and conditioning environmental variables of the model formula. With type = "response" or type = "working" the new data must contain environmental variables if the constrained component is desired, and the community data matrix if the residual or unconstrained component is desired. With these types, the function uses newdata to find new "lc" (constrained) or "wa" (unconstrained) scores and then finds the response or working data from these new row scores and species scores. The original site (row) and species (column) weights are used for type = "response" and type = "working" in correspondence analysis (cca), and therefore the number of rows must match in the original data and newdata.
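As an illustration of the rank and type arguments described above, here is a small sketch, assuming vegan and its dune data are available. The three "new" sites are simply the first rows of dune, reused to show the shape of the call rather than genuinely new observations.

```r
library(vegan)
data(dune)
data(dune.env)

mod <- cca(dune ~ A1 + Moisture, data = dune.env)

# Full-rank prediction of the constrained component, compared with fitted()
same_as_fitted <- all.equal(predict(mod, type = "response"), fitted(mod),
                            check.attributes = FALSE)

# Rank-1 approximation of the response data (first constrained axis only)
approx1 <- predict(mod, type = "response", rank = 1)

# Passive "wa" site scores: species (columns) must match the original data.
# An explicit scaling is requested so the scores could be added to a plot.
new_sites <- dune[1:3, ]
wa_new <- predict(mod, newdata = new_sites, type = "wa", scaling = "sites")

isTRUE(same_as_fitted)
dim(approx1)
wa_new
```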
If a completely new data frame is created, extreme care is needed in defining variables similarly as in the original model, in particular with (ordered) factors. If ordination was performed with the formula interface, the newdata can be a data frame or matrix, but extreme care is needed that the columns match in the original and newdata.
Function calibrate.cca finds estimates of constraints from community ordination or "wa" scores from cca, rda and capscale. This is often known as calibration, bioindication or environmental reconstruction. Basically, the method is similar to projecting site scores onto biplot arrows, but it uses regression coefficients. The function can be called with newdata so that cross-validation is possible. The newdata may contain new sites, but species must match in the original and new data. The function does not work with partial models with a Condition term, and it cannot be used with newdata for capscale results. The results may only be interpretable for continuous variables.
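A hold-out use of calibrate with newdata might look like the sketch below. It assumes vegan and its dune data, and the split into sites 1 to 15 for training and 16 to 20 for testing is arbitrary and chosen only for illustration. Species absent from the training sites are dropped first so that the species match between the model and newdata.

```r
library(vegan)
data(dune)
data(dune.env)

# Arbitrary split: first 15 sites for fitting, last 5 held out
train_rows <- 1:15
test_rows <- 16:20

# Keep only species that occur in the training sites
keep <- colSums(dune[train_rows, ]) > 0
train_y <- dune[train_rows, keep]
test_y <- dune[test_rows, keep]

# Single continuous constraint, no Condition term
mod <- cca(train_y ~ A1, data = dune.env[train_rows, ])

# Estimate A1 for the held-out sites from their community composition
pred <- calibrate(mod, newdata = test_y)

# Prediction error against the observed values
err <- pred[, "A1"] - dune.env$A1[test_rows]
err
```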
Function coef will give the regression coefficients from centred environmental variables (constraints and conditions) to linear combination scores. The coefficients are for unstandardized environmental variables. The coefficients will be NA for aliased effects.
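The behaviour for aliased effects can be seen with a deliberately redundant constraint. This is a sketch assuming vegan and its dune data. The variable A1copy is a hypothetical exact copy of A1, added here only to force aliasing.

```r
library(vegan)
data(dune)
data(dune.env)

# Coefficients from constraints to linear combination scores
mod <- cca(dune ~ A1 + Moisture, data = dune.env)
b <- coef(mod)
b

# Add an exact copy of A1 (hypothetical variable) so one term is aliased
env2 <- transform(dune.env, A1copy = A1)
mod2 <- cca(dune ~ A1 + A1copy, data = env2)
b2 <- coef(mod2)
b2
```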
Function predict.decorana is similar to predict.cca. However, type = "species" is not available in detrended correspondence analysis (DCA), because detrending destroys the mutual reciprocal averaging (except for the first axis when rescaling is not used). Detrended CA does not attempt to approximate the original data matrix, so type = "response" has no meaning in detrended analysis (except with rank = 1).
cca, rda, capscale, decorana, vif, goodness.cca.
library(vegan)
data(dune)
data(dune.env)
mod <- cca(dune ~ A1 + Management + Condition(Moisture), data=dune.env)
# Definition of the concepts 'fitted' and 'residuals'
mod
cca(fitted(mod))
cca(residuals(mod))
# Remove rare species (freq==1) from 'cca' and find their scores
# 'passively'.
freq <- specnumber(dune, MARGIN=2)
freq
mod <- cca(dune[, freq>1] ~ A1 + Management + Condition(Moisture), dune.env)
predict(mod, type="sp", newdata=dune[, freq==1], scaling="species")
# New sites
predict(mod, type="lc", newdata=data.frame(A1 = 3, Management="NM", Moisture="2"), scaling=2)
# Calibration and residual plot
mod <- cca(dune ~ A1 + Moisture, dune.env)
pred <- calibrate(mod)
pred
with(dune.env, plot(A1, pred[,"A1"] - A1, ylab="Prediction Error"))
abline(h=0)