Fits a palaeoecological transfer function model using principal component regression, with an optional transformation of the matrix of predictor variables when these are species abundance data.
# S3 method for default
pcr(x, y, ncomp, tranFun, ...)
# S3 method for formula
pcr(formula, data, subset, na.action, ..., model = FALSE)
Hellinger(x, ...)
ChiSquare(x, apply = FALSE, parms)
# S3 method for pcr
performance(object, ...)
# S3 method for pcr
residuals(object, comps = NULL, ...)
# S3 method for pcr
fitted(object, comps = NULL, ...)
# S3 method for pcr
coef(object, comps = NULL, ...)
# S3 method for pcr
screeplot(x, restrict = NULL,
display = c("RMSE","avgBias","maxBias","R2"),
xlab = NULL, ylab = NULL, main = NULL, sub = NULL, ...)
# S3 method for pcr
eigenvals(x, ...)
Matrix or data frame of predictor variables. Usually species composition or abundance data for transfer function models. For screeplot and eigenvals, an object of class "pcr".
Numeric vector; the response variable to be modelled.
numeric; number of principal components to build models for. If not supplied the largest possible number of components is determined.
function; a function or name of a function that performs a transformation of the predictor variables x. The function must be self-contained as no arguments are passed to the function when it is applied. See Details for more information.
a model formula.
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables specified on the RHS of the model formula. If not found in data, the variables are taken from environment(formula), typically the environment from which pcr is called.
an optional vector specifying a subset of observations to be used in the fitting process.
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
logical; if TRUE, the model frame is returned.
logical; should an existing transformation, using pre-computed meta-parameters, be applied?
list; a named list of parameters computed during model fitting that can be used to apply the transformation during prediction.
an object of class "pcr".
numeric; which components to return.
numeric; limit the number of components on the screeplot.
character; which model performance statistic should be drawn on the screeplot?
character; labels for the plot.
Arguments passed to other methods.
Returns an object of class "pcr", a list with the following components:
matrix; the PCR estimates of the response. The columns contain fitted values using C components, where C is the Cth column of the matrix.
matrix; regression coefficients for the PCR. Columns as per fitted above.
matrix; residuals, where the Cth column represents a PCR model using C components.
numeric; means of the predictor variables in the training data.
numeric; mean of the response variable in the training data.
numeric; variance explained by the PCR model. These are the squares of the singular values.
numeric; total variance in the training data
the matched call.
transformation function used. NA if none supplied/used.
list; meta-parameters used to compute the transformed training data.
data frame; cross-validation performance statistics for the model.
numeric; number of principal components computed
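The layout of the fitted values and residuals described above, with one column per number of components, can be illustrated with a minimal base R sketch. This is not the internal code of pcr(); the simulated data and all object names (X, y, scores, fitted_vals, resid_vals) are assumptions made for illustration only.

```r
## Minimal PCR sketch in base R (illustrative only; not the pcr() internals)
set.seed(1)
X <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)    # simulated predictors
y <- drop(X %*% c(1, -1, 0.5, 0, 0)) + rnorm(50, sd = 0.1)

x_means <- colMeans(X)              # means of the predictor variables
y_mean  <- mean(y)                  # mean of the response
Xc <- sweep(X, 2, x_means, "-")     # centre the predictors
yc <- y - y_mean                    # centre the response

sv     <- svd(Xc)
ncomp  <- length(sv$d)              # largest possible number of components here
scores <- sv$u %*% diag(sv$d)       # principal component scores
sq_sv  <- sv$d^2                    # squared singular values of centred X

## Column C holds fitted values from a model using components 1..C
fitted_vals <- sapply(seq_len(ncomp), function(C) {
    Tc <- scores[, seq_len(C), drop = FALSE]
    b  <- solve(crossprod(Tc), crossprod(Tc, yc))  # regress response on scores
    drop(Tc %*% b) + y_mean
})

## Residuals, again one column per number of components
resid_vals <- y - fitted_vals
```

With all components retained, the last column of fitted_vals matches an ordinary least-squares fit of y on X, which is a convenient sanity check for a sketch like this.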
When applying cross-validation (CV) to transfer function models, any transformation of the predictors must be applied separately during each iteration of the CV procedure to the part of the data used in fitting the model. In the same way, any samples to be predicted from the model must use any meta-parameters derived from the training data only. For example, centring is applied to the training data only, and the variable means used to centre the training data are then used to centre the test samples. The variable means should not be computed on a combination of the training and test samples.
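The centring rule can be sketched in base R as follows. The data are simulated and the object names (X_train, X_test, train_means) are hypothetical; the point is only that the test samples are centred with means computed from the training samples.

```r
## Centre test samples using training-set means only (illustrative sketch)
set.seed(42)
X <- matrix(runif(20 * 3), nrow = 20, ncol = 3)   # simulated predictors

train_id <- 1:15
X_train <- X[train_id, , drop = FALSE]
X_test  <- X[-train_id, , drop = FALSE]

## Meta-parameters are derived from the training data alone
train_means <- colMeans(X_train)

X_train_c <- sweep(X_train, 2, train_means, "-")
X_test_c  <- sweep(X_test,  2, train_means, "-")  # same means, not recomputed
```

After this step the training columns have mean zero, while the test columns generally do not, which is expected.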
When using PCR, we might wish to apply a transformation to the species data predictor variables such that the PCA of those data preserves a dissimilarity coefficient other than the Euclidean distance. This transformation is applied to allow PCA to better describe patterns in the species data (Legendre & Gallagher 2001).
pcr handles this by accepting a user-supplied function that takes a single argument, the matrix of predictor variables. The function should return a matrix of the same dimension as the input. If any meta-parameters are required for subsequent use in prediction, these should be returned as attribute "parms", attached to the matrix.
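A user-written transformation following this contract might look like the sketch below. The function sqrtCentre and its behaviour are hypothetical; its apply and parms arguments mirror the ChiSquare() usage shown above, and how pcr() itself passes them at prediction time is an assumption here.

```r
## Hypothetical transformation: square-root, then column-centre.
## Meta-parameters (the column means) are returned in attribute "parms".
sqrtCentre <- function(x, apply = FALSE, parms) {
    x <- sqrt(as.matrix(x))
    if (apply) {
        means <- parms$means     # reuse means computed on the training data
    } else {
        means <- colMeans(x)     # compute means from the data supplied
    }
    out <- sweep(x, 2, means, "-")
    attr(out, "parms") <- list(means = means)
    out
}

train <- matrix(c(4, 9, 16, 1, 0, 25), nrow = 3)  # toy training data
test  <- matrix(c(1, 4), nrow = 1)                # toy sample to predict

tr <- sqrtCentre(train)
te <- sqrtCentre(test, apply = TRUE, parms = attr(tr, "parms"))
```

The returned matrix has the same dimension as the input, and the stored means allow new samples to be transformed consistently with the training data.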
Two example transformation functions are provided implementing the Hellinger and Chi Square transformations of Legendre & Gallagher (2001). Users can base their transformation functions on these. ChiSquare() illustrates how meta-parameters should be returned as the attribute "parms".
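For orientation, the Hellinger transformation takes the square root of the row-wise relative abundances. The sketch below is a stand-alone illustration with a hypothetical name (hellinger_sketch) and toy data; it is not the package's Hellinger() function, and it assumes every sample has a positive total abundance.

```r
## Stand-alone sketch of the Hellinger transformation (not Hellinger() itself)
hellinger_sketch <- function(x) {
    x <- as.matrix(x)
    sqrt(x / rowSums(x))    # square root of row proportions; rows must sum > 0
}

## Toy abundances: two samples (rows) by three species (columns)
abund <- matrix(c(10, 0, 30,
                   5, 5, 10), nrow = 2, byrow = TRUE)

H <- hellinger_sketch(abund)

## Euclidean distance on the transformed data ...
d_euclid <- as.numeric(dist(H))

## ... equals the Hellinger distance computed directly from the proportions
p <- abund / rowSums(abund)
d_hellinger <- sqrt(sum((sqrt(p[1, ]) - sqrt(p[2, ]))^2))
```

This is the sense in which a PCA of the transformed data preserves a dissimilarity other than the Euclidean distance on the raw abundances.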
# NOT RUN {
## pcr() and the example data come from the analogue package
library(analogue)
## Load the Imbrie & Kipp data and
## summer sea-surface temperatures
data(ImbrieKipp)
data(SumSST)
## normal interface and apply Hellinger transformation
mod <- pcr(ImbrieKipp, SumSST, tranFun = Hellinger)
mod
## formula interface, but as above
mod2 <- pcr(SumSST ~ ., data = ImbrieKipp, tranFun = Hellinger)
mod2
## Several standard methods are available
fitted(mod, comps = 1:4)
resid(mod, comps = 1:4)
coef(mod, comps = 1:4)
## Eigenvalues can be extracted
eigenvals(mod)
## screeplot method
screeplot(mod)
# }