Evaluate the local quality of a geostatistical model of uncertainty (GMU) using summary measures and graphical displays.
checkGMU(
observed,
simulated,
pi = seq(0.01, 0.99, 0.01),
symmetric = TRUE,
plotit = TRUE
)
Vector of observed values at the validation points. See ‘Details’ for more information.
Data frame or matrix with simulated values (columns) for each validation point (rows). See ‘Details’ for more information.
Vector defining the width of the series of probability intervals. Defaults to
pi = seq(0.01, 0.99, 0.01)
. See ‘Details’ for more information.
Logical for choosing the type of probability interval. Defaults to
symmetric = TRUE
. See ‘Details’ for more information.
Logical for plotting the results. Defaults to plotit = TRUE
.
A list
of summary measures and plots of the coverage probability and width of probability
intervals.
There is no standard way of evaluating the local quality of a GMU. The collection of summary measures and graphical displays presented here is far from being comprehensive. A few definitions are given bellow.
Error statistics measure how well the GMU predicts the measured values at the validation points. Four error statistics are presented:
Measures the bias of the predictions of the GMU, being defined as the mean of the differences between the average of the simulated values and the observed values, i.e. the average of all simulations is taken as the predicted value.
Measures the accuracy of the predictions of the GMU, being defined as the mean of the squared differences between the average of the simulated values and the observed values.
Measures how well the GMU estimate of the prediction error variance (PEV) approximates the observed prediction error variance, where the first is given by the variance of the simulated values, while the second is given by the squared differences between the average of the simulated values, i.e. the squared error (SE). The SRMSE is computed as the average of SE / PEV, where SRMSE > 1 indicates underestimation, while SRMSE < 1 indicates overestimation.
Measures how close the GMU predictions are to the observed values. A scatter plot of the observed values versus the average of the simulated values can be used to check for possible unwanted outliers and non-linearities. The square of the Pearson correlation coefficient measures the fraction of the overall spread of observed values that is explained by the GMU, that is, the amount of variance explained (AVE), also known as coefficient of determination or ratio of scatter.
The coverage probability of an interval is given by the number of times that that interval contains its parameter over several replications of an experiment. For example, consider the interquartile range \(IQR = Q3 - Q1\) of a Gaussian distributed variable with mean equal to zero and variance equal to one. The nominal coverage probability of the IQR is 0.5, i.e. two quarters of the data fall within the IQR. Suppose we generate a Gaussian distributed random variable with the same mean and variance and count the number of values that fall within the IQR defined above: about 0.5 of its values will fall within the IQR. If we continue generating Gaussian distributed random variables with the same mean and variance, on average, 0.5 of the values will fall in that interval.
Coverage probabilities are very useful to evaluate the local quality of a GMU: the closer the observed coverage probabilities of a sequence of probability intervals (PI) are to the nominal coverage probabilities of those PIs, the better the modeling of the local uncertainty.
Two types of PIs can be used here: symmetric, median-centered PIs, and left-bounded PIs. Papritz & Dubois (1999) recommend using left-bounded PIs because they are better at evidencing deviations for both large and small PIs. The authors also point that the coverage probabilities of the symmetric, median-centered PIs can be read from the coverage probability plots produced using left-bounded PIs.
In both cases, the PIs are computed at each validation location using the quantiles of the conditional cumulative distribution function (ccdf) defined by the set of realizations at that validation location. For a sequence of PIs of increasing width, we check which of them contains the observed value at all validation locations. We then average the results over all validation locations to compute the proportion of PIs (with the same width) that contains the observed value: this gives the coverage probability of the PIs.
Deutsch (1997) proposed three summary measures of the coverage probabilities to assess the local goodness of a GMU: accuracy ($A$), precision ($P$), and goodness ($G$). According to Deutsch (1997), a GMU can be considered “good” if it is both accurate and precise. Although easy to compute, these measures seem not to have been explored by many geostatisticians, except for the studies developed by Pierre Goovaerts and his later software implementation (Goovaerts, 2009). Richmond (2001) suggests that they should not be used as the only measures of the local quality of a GMU.
An accurate GMU is that for which the proportion \(p^*\) of true values falling within the $p$ PI is equal to or larger than the nominal probability $p$, that is, when \(p^* \geq p\). In the coverage probability plot, a GMU will be more accurate when all points are on or above the 1:1 line. The range of $A$ goes from 0 (lest accurate) to 1 (most accurate).
The precision, $P$, is defined only for an accurate GMU, and measures how close \(p^*\) is to $p$. The range of $P$ goes from 0 (lest precise) to 1 (most precise). Thus, a GMU will be more accurate when all points in the PI-width plot are on or above the 1:1 line.
The goodness, $G$, is a measure of the departure of the points from the 1:1 line in the coverage probability plot. $G$ ranges from 0 (minimum goodness) to 1 (maximum goodness), the maximum $G$ being achieved when \(p^* = p\), that is, all points in both coverage probability and interval width plots are exactly on the 1:1 line.
Deutsch, C. Direct assessment of local accuracy and precision. Baafi, E. Y. & Schofield, N. A. (Eds.) Geostatistics Wollongong '96. Dordrecht: Kinwer Academic Publishers, v. I, p. 115-125, 1997.
Papritz, A. & Dubois, J. R. Mapping heavy metals in soil by (non-)linear kriging: an empirical validation. G<U+00F3>mez-Hern<U+00E1>ndez, J.; Soares, A. & Froidevaux, R. (Eds.) geoENV II -- Geostatistics for Environmental Applications. Springer, p. 429-440, 1999.
Goovaerts, P. Geostatistical modelling of uncertainty in soil science. Geoderma. v. 103, p. 3 - 26, 2001.
Goovaerts, P. AUTO-IK: a 2D indicator kriging program for the automated non-parametric modeling of local uncertainty in earth sciences. Computers & Geosciences. v. 35, p. 1255-1270, 2009.
Richmond, A. J. Maximum profitability with minimum risk and effort. Xie, H.; Wang, Y. & Jiang, Y. (Eds.) Proceedings 29th APCOM. Lisse: A. A. Balkema, p. 45-50, 2001.
Ripley, B. D. Stochastic simulation. New York: John Wiley & Sons, p. 237, 1987.
# NOT RUN {
if (interactive()) {
set.seed(2001)
observed <- round(rnorm(100), 3)
simulated <- t(
sapply(1:length(observed), function (i) round(rnorm(100), 3)))
resa <- checkGMU(observed, simulated, symmetric = T)
resb <- checkGMU(observed, simulated, symmetric = F)
resa$error; resb$error
resa$goodness; resb$goodness
}
# }
Run the code above in your browser using DataLab