SpatioTemporal (version 0.9.2)

summaryStatsCV: Computes Summary Statistics for Cross-validation

Description

Computes summary statistics for cross-validation. The statistics include RMSE, R2, and coverage of the confidence intervals, both for all observations and stratified by date.

Usage

summaryStatsCV(predCV, pred.naive = NULL, lta = FALSE, 
               by.date = FALSE, p = 0.95, trans = NULL)

Arguments

predCV
Result of a cross-validation. Should be the output from predictCV.
pred.naive
Result of naive prediction; this is used to compute modified R2 values. Should be the output from predictNaive.
lta
Compute cross-validation statistics for the long term averages at each site. If trans!=NULL the transformation will be applied before computation of the averages, see compute.lt
by.date
Compute individual cross-validation statistics for each time-point. May lead to the computation of very many statistics.
p
Approximate coverage of the computed confidence bands. The confidence bands are used when computing the coverage of the cross-validated predictions.
trans
Transform observations and predictions before computing statistics. Different values for trans give different transforms; NULL (the default) applies no transformation.
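
As a rough illustration of how p and by.date interact, the sketch below computes the empirical coverage of prediction intervals on simulated data, first over all observations and then per date. The data frame toy and its columns are hypothetical stand-ins for cross-validated output, and the Gaussian interval construction (prediction plus or minus a normal quantile times the prediction standard deviation) is an assumption, not necessarily what summaryStatsCV does internally.

```r
## Toy data: 5 dates x 40 sites; all names here are hypothetical
set.seed(1)
n.dates <- 5
n.sites <- 40
toy <- data.frame(
  date = rep(as.Date("2020-01-01") + 0:(n.dates - 1), each = n.sites),
  pred = rnorm(n.dates * n.sites, mean = 10, sd = 2)
)
toy$pred.var <- 0.5   # assumed constant prediction variance
toy$obs <- toy$pred + rnorm(nrow(toy), sd = sqrt(toy$pred.var))

p <- 0.95
## Half-width multiplier of a two-sided Gaussian interval with coverage p
z <- qnorm((1 - p) / 2, lower.tail = FALSE)

## TRUE where the observation falls inside its prediction interval
inside <- abs(toy$obs - toy$pred) <= z * sqrt(toy$pred.var)

## Empirical coverage for all observations and stratified by date
coverage.all <- mean(inside)
coverage.by.date <- tapply(inside, toy$date, mean)

print(coverage.all)
print(coverage.by.date)
```

With a well-calibrated model the empirical coverage should be close to p; per-date values are noisier because each is based on fewer observations.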

Value

Returns a list containing:

  • Stats: A data.frame where the columns contain RMSE, R2 and coverage of the confidence interval(s) with nominal coverage p. At a minimum this is computed for all observations.

    If pred.naive!=NULL four additional rows are added to Stats. These rows contain adjusted R2 that compare cross-validated predictions to predictions computed using predictNaive. The adjusted R2 are computed as (1 - MSE_cv/MSE_naive). For this to make sense the locations used for the naive predictions should not be among the locations that cross-validated predictions are computed for.

    If lta=TRUE one additional row containing RMSE and R2 for the long term average predictions given by compute.ltaCV is added to Stats.

    If by.date=TRUE one additional row containing RMSE, R2 and coverage is added to Stats for each unique observation date.

  • res, res.norm: Residuals and normalised residuals from the cross-validated predictions. Two (number of observations)-by-1 vectors with residuals for the observations in mesa.data.model$obs.

    The residuals are computed as:

        res <- predCV$pred.obs[,"obs"] - predCV$pred.obs[,"pred"]
        res.norm <- res / sqrt(predCV$pred.obs[,"pred.var"])

    That is, the normalised residuals are the residuals divided by the prediction standard deviation.

  • lta: A data.frame with predicted and observed long term averages at each site, or NULL if lta=FALSE. If given, this is the output from compute.ltaCV(predCV, trans); see compute.ltaCV.
  • p: Approximate coverage of the computed confidence bands, same as p in the input.
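
The quantities in the returned list can be reproduced by hand on a small example. The sketch below uses a made-up matrix pred.obs with the columns obs, pred and pred.var that appear in the residual formulas, plus a hypothetical vector naive of naive predictions. R2 is assumed here to be 1 - MSE/var(obs), which may differ in detail from the package's definition; the adjusted R2 follows the (1 - MSE_cv/MSE_naive) expression given for Stats.

```r
## Hypothetical stand-in for predCV$pred.obs (values are made up)
pred.obs <- cbind(obs      = c(2.1, 3.4, 1.8, 4.0, 2.9, 3.3),
                  pred     = c(2.0, 3.0, 2.2, 3.7, 3.1, 3.0),
                  pred.var = c(0.25, 0.16, 0.25, 0.09, 0.16, 0.25))
## Hypothetical naive predictions for the same observations
naive <- c(2.5, 2.5, 2.5, 3.5, 3.5, 3.5)

## Residuals and normalised residuals
res      <- pred.obs[, "obs"] - pred.obs[, "pred"]
res.norm <- res / sqrt(pred.obs[, "pred.var"])

## RMSE and R2 (assumed definition: 1 - MSE / var(obs))
mse.cv <- mean(res^2)
rmse   <- sqrt(mse.cv)
r2     <- 1 - mse.cv / var(pred.obs[, "obs"])

## Adjusted R2 relative to the naive predictions: 1 - MSE_cv / MSE_naive
mse.naive <- mean((pred.obs[, "obs"] - naive)^2)
r2.adj    <- 1 - mse.cv / mse.naive

print(c(RMSE = rmse, R2 = r2, R2.adj = r2.adj))
```

An adjusted R2 near 1 means the cross-validated predictions have a much smaller squared error than the naive ones; a value at or below 0 means they do no better.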

See Also

See createCV and estimateCV for cross-validation set-up and estimation.

For computing CV statistics, see also predictNaive and compute.ltaCV; for further illustration see plotCV and CVresiduals.qqnorm.

Examples

##load data
data(mesa.data.model)
data(mesa.data.res)

##Extract pre-computed cross-validated predictions
pred.cv <- mesa.data.res$pred.cv

##Naive predictions based on AQS sites only
pred.N <- predictNaive(mesa.data.model, type="AQS")

##compute summary statistics
stat.CV <- summaryStatsCV(pred.cv, pred.naive=pred.N,
                          lta=TRUE, by.date=TRUE)

##study the summary statistics (for observations and long term average)
stat.CV$Stats[1:2,]

##adjusted R2 values, these are slightly strange since we
##(in this case) are basing the naive predictions on 
##things left out of the cross-validation.
tail(stat.CV$Stats, 4)

##plot the RMSE for each date as a function of date
plot(as.Date(rownames(stat.CV$Stats[3:(dim(stat.CV$Stats)[1]-4),])),
     stat.CV$Stats[3:(dim(stat.CV$Stats)[1]-4),"RMSE"],
     xlab="Date",ylab="RMSE")
##add overall RMSE as reference
abline(h=stat.CV$Stats["obs","RMSE"])

##Some plots for the residuals
par(mfrow=c(2,2), mar=c(4.5,4.5,3,.5))
## residuals against observations
plot(mesa.data.model$obs$obs, stat.CV$res,
     ylab="Residuals", xlab="Observations")
## Norm-plot for the residuals
CVresiduals.qqnorm(stat.CV$res)
## Norm-plot for the normalised residuals, these should be N(0,1).
CVresiduals.qqnorm(stat.CV$res.norm, norm=TRUE)
## normalised residuals against the first temporal trend
CVresiduals.scatter(stat.CV$res.norm, mesa.data.model$F[,2],
                    xlab="First temporal trend")
