analogue (version 0.17-6)

pcr: Principal component regression transfer function models

Description

Fits a palaeoecological transfer function model using principal component regression, using an optional transformation of the matrix of predictor variables when these are species abundance data.

Usage

# S3 method for default
pcr(x, y, ncomp, tranFun, ...)

# S3 method for formula
pcr(formula, data, subset, na.action, ..., model = FALSE)

Hellinger(x, ...)

ChiSquare(x, apply = FALSE, parms)

# S3 method for pcr
performance(object, ...)

# S3 method for pcr
residuals(object, comps = NULL, ...)

# S3 method for pcr
fitted(object, comps = NULL, ...)

# S3 method for pcr
coef(object, comps = NULL, ...)

# S3 method for pcr
screeplot(x, restrict = NULL,
          display = c("RMSE","avgBias","maxBias","R2"),
          xlab = NULL, ylab = NULL, main = NULL, sub = NULL, ...)

# S3 method for pcr
eigenvals(x, ...)

Value

Returns an object of class "pcr", a list with the following components:

fitted.values

matrix; the PCR estimates of the response. Column C contains the fitted values from a model using C components.

coefficients

matrix; regression coefficients for the PCR. Columns as per fitted.values above.

residuals

matrix; residuals, where the Cth column represents a PCR model using C components.

scores

loadings

Yloadings

xMeans

numeric; means of the predictor variables in the training data.

yMean

numeric; mean of the response variable in the training data.

varExpl

numeric; variance explained by the PCR model. These are the squares of the singular values.

totvar

numeric; total variance in the training data

call

the matched call.

tranFun

transformation function used. NA if none supplied/used.

tranParms

list; meta-parameters used to compute the transformed training data.

performance

data frame; cross-validation performance statistics for the model.

ncomp

numeric; number of principal components computed

Arguments

x

Matrix or data frame of predictor variables. Usually species composition or abundance data for transfer function models. For screeplot and eigenvals, an object of class "pcr".

y

Numeric vector; the response variable to be modelled.

ncomp

numeric; number of principal components to build models for. If not supplied the largest possible number of components is determined.

tranFun

function; a function or name of a function that performs a transformation of the predictor variables x. The function must be self-contained, as no arguments other than x are passed to it when it is applied. See Details for more information.

formula

a model formula.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables specified on the RHS of the model formula. If not found in data, the variables are taken from environment(formula), typically the environment from which pcr is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

model

logical; if TRUE, the model frame is returned.

apply

logical; should an existing transformation, using pre-computed meta-parameters, be applied?

parms

list; a named list of parameters computed during model fitting that can be used to apply the transformation during prediction.

object

an object of class "pcr".

comps

numeric; which components to return.

restrict

numeric; limit the number of components on the screeplot.

display

character; which model performance statistic should be drawn on the screeplot?

xlab, ylab, main, sub

character; labels for the plot.

...

Arguments passed to other methods.

Author

Gavin L. Simpson

Details

When applying cross-validation (CV) to transfer function models, any transformation of the predictors must be applied separately during each iteration of the CV procedure to the part of the data used in fitting the model. In the same way, any samples to be predicted from the model must use meta-parameters derived from the training data only. For example, centring is applied to the training data only, and the variable means used to centre the training data are also used to centre the test samples. The variable means should not be computed on a combination of the training and test samples.
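The centring rule described above can be illustrated with a minimal base R sketch. The toy matrix and the object names (train, test, trainC, testC) are assumptions made for illustration and are not part of the package; only the idea of reusing the training means comes from the text.

```r
## Toy data: 6 samples by 3 variables (values are made up for illustration)
x <- matrix(c(1, 2, 3, 4, 5, 6,
              2, 4, 6, 8, 10, 12,
              5, 3, 8, 1, 9, 4),
            ncol = 3,
            dimnames = list(NULL, c("sp1", "sp2", "sp3")))

train <- x[1:4, , drop = FALSE]   # samples used to fit the model
test  <- x[5:6, , drop = FALSE]   # samples to be predicted

## Meta-parameters are derived from the training samples only
xMeans <- colMeans(train)

## Centre both sets with the *training* means
trainC <- sweep(train, 2, xMeans, "-")
testC  <- sweep(test,  2, xMeans, "-")

colMeans(trainC)  # zero for every variable
colMeans(testC)   # generally not zero, which is expected
```

In a CV loop the same two steps would be repeated within each iteration, with xMeans recomputed from that iteration's training part.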

When using PCR, we might wish to apply a transformation to the species data predictor variables such that the PCA of those data preserves a dissimilarity coefficient other than the Euclidean distance. This transformation is applied to allow PCA to better describe patterns in the species data (Legendre & Gallagher 2001).
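As a rough illustration of this idea, the sketch below applies a Hellinger-type transformation (square root of the row proportions, following Legendre & Gallagher 2001) in base R and checks that a PCA retaining all components reproduces the Euclidean distances among the transformed samples. The function hellingerSketch and the toy abundances are hypothetical stand-ins; this is not the package's own Hellinger() code.

```r
## Hypothetical stand-in, not analogue's own Hellinger()
hellingerSketch <- function(x) {
    x <- as.matrix(x)
    sqrt(x / rowSums(x))   # square root of the row proportions
}

## Toy abundances: 4 samples by 3 taxa (made-up values)
spp <- matrix(c(10, 0, 5, 2,
                 0, 8, 5, 2,
                 5, 2, 0, 6),
              ncol = 3)

h <- hellingerSketch(spp)

## Euclidean distances among the transformed samples; these are the
## Hellinger distances among the original samples
dHell <- dist(h)

## PCA of the transformed data; with all components retained, the
## distances among sample scores match those among the rows of h
pca  <- prcomp(h, center = TRUE, scale. = FALSE)
dPCA <- dist(pca$x)

all.equal(as.vector(dHell), as.vector(dPCA))
```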

pcr handles this by accepting a user-supplied function that takes a single argument, the matrix of predictor variables. The function should return a matrix of the same dimension as the input. If any meta-parameters are required for subsequent use in prediction, these should be returned as the attribute "parms", attached to the matrix.

Two example transformation functions are provided implementing the Hellinger and Chi Square transformations of Legendre & Gallagher (2001). Users can base their transformation functions on these. ChiSquare() illustrates how meta-parameters should be returned as the attribute "parms".
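A user-written transformation might look something like the following sketch, which is modelled on the ChiSquare(x, apply = FALSE, parms) signature shown under Usage. The function name chiSquareSketch, the element names colTot and total inside "parms", and the toy data are all assumptions; the package's own ChiSquare() may store its meta-parameters differently, so its source should be consulted before relying on this layout.

```r
## Hypothetical function; names inside "parms" are illustrative only
chiSquareSketch <- function(x, apply = FALSE, parms) {
    x <- as.matrix(x)
    if (apply) {
        ## Prediction: reuse meta-parameters from the training data
        colTot <- parms$colTot
        total  <- parms$total
    } else {
        ## Fitting: derive meta-parameters from the data supplied
        colTot <- colSums(x)
        total  <- sum(x)
    }
    ## Chi-square transformation of Legendre & Gallagher (2001):
    ## sqrt(total) * x[i, j] / (rowsum[i] * sqrt(colTot[j]))
    out <- sqrt(total) * sweep(x / rowSums(x), 2, sqrt(colTot), "/")
    if (!apply) {
        ## Attach meta-parameters for later use in prediction
        attr(out, "parms") <- list(colTot = colTot, total = total)
    }
    out
}

## Toy training data (4 samples by 3 taxa) and one new sample
train <- matrix(c(10, 0, 5, 2,
                   4, 8, 5, 2,
                   5, 2, 1, 6),
                ncol = 3)
new <- matrix(c(3, 1, 6), nrow = 1)

trainT <- chiSquareSketch(train)
p      <- attr(trainT, "parms")
newT   <- chiSquareSketch(new, apply = TRUE, parms = p)
```

The new sample is scaled with the column totals and grand total of the training data, never with its own, which is the behaviour the Details section asks for.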

See Also

wa

Examples

## Load the analogue package, the Imbrie & Kipp data and
## summer sea-surface temperatures
library(analogue)
data(ImbrieKipp)
data(SumSST)

## normal interface and apply Hellinger transformation
mod <- pcr(ImbrieKipp, SumSST, tranFun = Hellinger)
mod

## formula interface, but as above
mod2 <- pcr(SumSST ~ ., data = ImbrieKipp, tranFun = Hellinger)
mod2

## Several standard methods are available
fitted(mod, comps = 1:4)
resid(mod, comps = 1:4)
coef(mod, comps = 1:4)

## Eigenvalues can be extracted
eigenvals(mod)

## screeplot method
screeplot(mod)
