
rioja (version 1.0-7)

MR: Multiple regression

Description

Functions for reconstructing (predicting) environmental values from biological assemblages using multiple regression.

Usage

MR(y, x, check.data=TRUE, lean=FALSE, ...)

MR.fit(y, x, lean=FALSE)

# S3 method for MR
predict(object, newdata=NULL, sse=FALSE, nboot=100, match.data=TRUE, verbose=TRUE, ...)

# S3 method for MR
crossval(object, cv.method="loo", verbose=TRUE, ngroups=10, nboot=100, h.cutoff=0, h.dist=NULL, ...)

# S3 method for MR
performance(object, ...)

# S3 method for MR
print(x, ...)

# S3 method for MR
summary(object, full=FALSE, ...)

# S3 method for MR
plot(x, resid=FALSE, xval=FALSE, xlab="", ylab="", ylim=NULL, xlim=NULL, add.ref=TRUE, add.smooth=FALSE, ...)

# S3 method for MR
residuals(object, cv=FALSE, ...)

# S3 method for MR
coef(object, ...)

# S3 method for MR
fitted(object, ...)

Value

Function MR returns an object of class MR with the following named elements:

coefficients

species coefficients (the updated "optima").

fitted.values

fitted values for the training set.

call

original function call.

x

environmental variable used in the model.

Function crossval also returns an object of class MR and adds the following named elements:

predicted

predicted values of each training set sample under cross-validation.

residuals.cv

prediction residuals.

If function predict is called with newdata=NULL it returns the fitted values of the original model, otherwise it returns a list with the following named elements:

fit

predicted values for newdata.

If sample specific errors were requested the list will also include:

fit.boot

mean of the bootstrap estimates of newdata.

v1

standard error of the bootstrap estimates for each new sample.

v2

root mean squared error for the training set samples, across all bootstrap samples.

SEP

standard error of prediction, calculated as the square root of v1^2 + v2^2.
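
The SEP formula above can be sketched in a few lines of base R. The numbers below are hypothetical and are not output from rioja; the sketch also assumes v2 is a single value shared by all new samples, while v1 varies per sample.

```r
# Hypothetical values, for illustration only (not produced by rioja).
v1 <- c(0.42, 0.55, 0.61)  # bootstrap standard error for each new sample
v2 <- 1.20                 # training-set RMSE across bootstrap samples (assumed scalar)

# Combine the two error components in quadrature.
sep <- sqrt(v1^2 + v2^2)
print(round(sep, 3))
```

Because the components are added in quadrature, SEP can never be smaller than the larger of v1 and v2 for a given sample.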

Function performance returns a matrix of performance statistics for the MR model. See performance for a description of the summary.

Arguments

y

a data frame or matrix of biological abundance data.

x, object

a vector of environmental values to be modelled or an object of class MR.

newdata

new biological data to be predicted.

check.data

logical to perform simple checks on the input data.

match.data

logical indicating whether the function will match two species datasets by their column names. You should only set this to FALSE if you are sure the column names match exactly.

lean

logical to exclude some output from the resulting models (used when cross-validating to speed calculations).

full

logical to show head and tail of output in summaries.

resid

logical to plot residuals instead of fitted values.

xval

logical to plot cross-validation estimates.

xlab, ylab, xlim, ylim

additional graphical arguments passed to the plot method.

add.ref

add 1:1 line on plot.

add.smooth

add loess smooth to plot.

cv.method

cross-validation method, either "loo", "lgo", "bootstrap" or "h-block".

verbose

logical to show feedback during cross-validation.

nboot

number of bootstrap samples.

ngroups

number of groups in leave-group-out cross-validation, or a vector containing leave-out group membership.

h.cutoff

cutoff for h-block cross-validation. Only training samples at a distance greater than h.cutoff from each test sample will be used.

h.dist

distance matrix for use in h-block cross-validation. Usually a matrix of geographical distances between samples.

sse

logical indicating that sample specific errors should be calculated.

cv

logical indicating whether to return cross-validation residuals instead of model residuals.

...

additional arguments.
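
As an illustration of how h.cutoff and h.dist interact in h-block cross-validation, the following base R sketch selects, for each test sample, the training samples lying farther away than the cutoff. The coordinates are synthetic and the selection rule is written directly from the argument description above; it is not taken from the rioja source.

```r
set.seed(42)
n <- 8

# Synthetic sample coordinates (hypothetical), used to build a distance matrix.
coords <- cbind(runif(n, 0, 100), runif(n, 0, 100))
h.dist <- as.matrix(dist(coords))
h.cutoff <- 30

# For each test sample i, keep only training samples farther than h.cutoff.
train.sets <- lapply(seq_len(n), function(i) which(h.dist[i, ] > h.cutoff))

# Number of training samples retained for each test sample.
print(lengths(train.sets))
```

Under this rule a test sample is never in its own training set, because its distance to itself is zero. With h.cutoff = 0 the rule would exclude only the test sample itself and any samples at zero distance from it.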

Author

Steve Juggins

Details

Function MR performs multiple regression. It is a wrapper to lm.
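
Since MR wraps lm, the kind of model involved can be sketched with base R alone. The example below assumes the environmental variable is regressed on the taxon abundances; the exact formula MR passes to lm is not stated on this page, and the data and taxon names are synthetic.

```r
set.seed(1)
n <- 40

# Synthetic abundances for three hypothetical taxa.
y <- data.frame(taxonA = runif(n, 0, 50),
                taxonB = runif(n, 0, 30),
                taxonC = runif(n, 0, 20))

# Synthetic environmental variable that depends on two of the taxa.
x <- 5 + 0.2 * y$taxonA - 0.1 * y$taxonB + rnorm(n, sd = 0.5)

# Regress the environmental variable on all taxa (assumed model form).
train <- cbind(env = x, y)
mod <- lm(env ~ ., data = train)

print(coef(mod))          # intercept plus one coefficient per taxon
print(head(fitted(mod)))  # fitted environmental values for the training set
```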

Function predict predicts values of the environmental variable for newdata or returns the fitted (predicted) values from the original modern dataset if newdata is NULL. Variables are matched between training and newdata by column name (if match.data is TRUE). Use compare.datasets to assess conformity of two species datasets and identify possible no-analogue samples.
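
Matching by column name can be pictured with the base R sketch below. The helper match_columns is hypothetical and is not part of rioja; it assumes that taxa absent from newdata are filled with zeros and that taxa absent from the training set are dropped, which may differ from what predict does internally.

```r
# Hypothetical training and new datasets with partly overlapping taxa.
train   <- data.frame(sp1 = c(10, 20), sp2 = c(5, 0), sp3 = c(0, 15))
newdata <- data.frame(sp3 = c(2, 4), sp1 = c(30, 12), sp9 = c(1, 1))

# Hypothetical helper: align newdata to the training set's columns by name.
match_columns <- function(train, newdata) {
  # Start from zeros so taxa missing from newdata get zero abundance.
  out <- matrix(0, nrow = nrow(newdata), ncol = ncol(train),
                dimnames = list(rownames(newdata), colnames(train)))
  # Copy across only the taxa present in both datasets.
  shared <- intersect(colnames(train), colnames(newdata))
  out[, shared] <- as.matrix(newdata[, shared, drop = FALSE])
  as.data.frame(out)
}

matched <- match_columns(train, newdata)
print(matched)
```

The result has the same columns, in the same order, as the training set, which is the condition you would need to guarantee yourself before setting match.data to FALSE.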

MR has methods fitted and residuals that return the fitted values (estimates) and residuals for the training set; performance, which returns summary performance statistics (see below); coef, which returns the species coefficients; and print and summary to summarise the output. MR also has a plot method that produces scatter plots of predicted vs observed measurements for the training set.

See Also

WA, MAT, performance, and compare.datasets for diagnostics.

Examples

library(rioja)
data(IK)
spec <- IK$spec
SumSST <- IK$env$SumSST
core <- IK$core

# Generate an MR model using taxa with maximum abundance > 20%

mx <- apply(spec, 2, max)
spec2 <- spec[, mx > 20]

fit <- MR(spec2, SumSST)
fit
# cross-validate model
fit.cv <- crossval(fit, cv.method="lgo")
fit.cv

# predict the core
pred <- predict(fit, core)

# plot predictions - depths are in rownames
depth <- as.numeric(rownames(core))
plot(depth, pred$fit[, 1], type="b")

if (FALSE) {
# predictions with sample specific errors
# takes approximately 1 minute to run
pred <- predict(fit, core, sse=TRUE, nboot=1000)
pred
}
