This function performs model validation for generalized linear geostatistical models (Binomial and Poisson) using Monte Carlo methods based on the variogram.
variog.diagnostic.glgm(
object,
n.sim = 200,
uvec = NULL,
plot.results = TRUE,
which.test = "both"
)
object
: an object of class "PrevMap" obtained as output from binomial.logistic.MCML or poisson.log.MCML.
n.sim
: integer indicating the number of simulations used for the variogram-based diagnostics. Default is n.sim=200.
uvec
: a vector of values used to define the variogram binning. If uvec=NULL, then uvec is set to seq(MIN_DIST,(MAX_DIST-MIN_DIST)/2,length=15).
plot.results
: if plot.results=TRUE, a plot is returned showing the results for the selected test(s) for spatial correlation. By default plot.results=TRUE.
defined as the distance at which the fitted spatial correlation is no less than 0.05. Default is range.fact=1
which.test
: a character string specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is which.test="both". See 'Details'.
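The default binning described for uvec can be reproduced in base R. This is a minimal sketch that assumes MIN_DIST and MAX_DIST denote the smallest and largest pairwise distances between the sampled locations; the coordinates are simulated purely for illustration.

```r
# Hypothetical coordinates; any two-column matrix of locations would do.
set.seed(1)
coords <- cbind(runif(50), runif(50))

# Assumption: MIN_DIST and MAX_DIST are the extremes of the pairwise distances.
d <- as.numeric(dist(coords))
min_dist <- min(d)
max_dist <- max(d)

# Bin breakpoints following the expression given for uvec = NULL.
uvec <- seq(min_dist, (max_dist - min_dist) / 2, length.out = 15)

# 15 breakpoints define 14 distance bins.
length(uvec) - 1
```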
An object of class "PrevMap.diagnostic" which is a list containing the following components:
obs.variogram
: a vector of length length(uvec)-1
containing the values of the variogram for each of
the distance bins defined through uvec
.
distance.bins
: a vector of length length(uvec)-1
containing the average distance within each of the distance bins
defined through uvec
.
n.bins
: a vector of length length(uvec)-1
containing the number of pairs of data-points falling within each distance bin.
lower.lim
: (available only if which.test="both" or which.test="variogram") a vector of length length(uvec)-1 containing the lower limits of the 95% confidence intervals generated under the assumption that the fitted model is suitable, at each of the distance bins defined through uvec.
upper.lim
: (available only if which.test="both" or which.test="variogram") a vector of length length(uvec)-1 containing the upper limits of the 95% confidence intervals generated under the assumption that the fitted model is suitable, at each of the distance bins defined through uvec.
mode.rand.effects
: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.
p.value
: (available only if which.test="both"
or which.test="test statistic"
) p-value of the test for residual spatial correlation.
lse.variogram
: (available only if lse.variogram=TRUE) a vector of length length(uvec)-1 containing the values of the Matern variogram estimated via a weighted least squares fit.
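To make the binned components concrete, the sketch below computes a classical empirical variogram in base R and returns three vectors named after obs.variogram, distance.bins and n.bins. The helper empirical_variogram is hypothetical and written only for illustration; the estimator used internally by the package may differ.

```r
# Hypothetical helper: classical binned empirical variogram.
empirical_variogram <- function(coords, z, uvec) {
  d <- as.matrix(dist(coords))
  # Semivariance contribution of each pair: half the squared difference.
  g <- 0.5 * outer(z, z, "-")^2
  keep <- upper.tri(d)          # count each pair of points once
  d <- d[keep]
  g <- g[keep]
  # Assign each pair to a distance bin; pairs outside range(uvec) get NA
  # and are dropped by tapply() and table().
  bin <- cut(d, breaks = uvec, include.lowest = TRUE)
  list(
    obs.variogram = as.numeric(tapply(g, bin, mean)),
    distance.bins = as.numeric(tapply(d, bin, mean)),
    n.bins = as.numeric(table(bin))
  )
}

# Toy data: spatially independent values at random locations.
set.seed(2)
coords <- cbind(runif(80), runif(80))
z <- rnorm(80)
uvec <- seq(0, 0.6, length.out = 11)
v <- empirical_variogram(coords, z, uvec)
```

Each component has length length(uvec)-1, one entry per distance bin; bins that contain no pairs yield NA in the first two components and 0 in n.bins.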
The function takes as input, through the argument object, a fitted generalized linear geostatistical model for an outcome \(Y_i\), with linear predictor
$$\eta_i=d_i'\beta+S(x_i)+Z_i$$
where \(d_i\) is a vector of covariates which are specified through formula
, \(S(x_i)\) is a spatial Gaussian process and the \(Z_i\) are assumed to be zero-mean Gaussian.
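As an illustration of this linear predictor, the following base R sketch simulates \(\eta_i\) and then draws Binomial and Poisson outcomes from it. All parameter values are hypothetical, and the exponential correlation function (a Matern with smoothness 0.5) is assumed only to keep the example self-contained.

```r
set.seed(3)
n <- 100
coords <- cbind(runif(n), runif(n))

d_mat <- cbind(1, rnorm(n))   # covariate vectors d_i: intercept plus one covariate
beta <- c(-1, 0.5)            # hypothetical regression coefficients

# Assumption: exponential correlation, i.e. Matern with smoothness 0.5.
sigma2 <- 1                   # hypothetical variance of S(x)
phi <- 0.2                    # hypothetical scale of the spatial correlation
tau2 <- 0.3                   # hypothetical variance of Z_i
Sigma <- sigma2 * exp(-as.matrix(dist(coords)) / phi)

S <- as.numeric(t(chol(Sigma)) %*% rnorm(n))  # spatial Gaussian process S(x_i)
Z <- rnorm(n, sd = sqrt(tau2))                # zero-mean Gaussian Z_i
eta <- as.numeric(d_mat %*% beta) + S + Z

# Binomial outcome with logit link and Poisson outcome with log link.
m <- 20
y_binom <- rbinom(n, size = m, prob = plogis(eta))
y_pois <- rpois(n, lambda = exp(eta))
```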
The model validation is performed on the adopted stationary and isotropic Matern covariance function used for \(S(x_i)\).
More specifically, the function allows the users to select either of the following validation procedures.
Variogram-based graphical validation
This graphical diagnostic is performed by setting which.test="both" or which.test="variogram". The output consists of 95% confidence intervals (see lower.lim and upper.lim below) that are generated under the assumption that the fitted model did generate the analysed data-set.
This validation procedure proceeds through the following steps.
1. Obtain the mean, say \(\hat{Z}_i\), of the \(Z_i\) conditioned on the data \(Y_i\) and by setting \(S(x_i)=0\) in the equation above.
2. Compute the empirical variogram using \(\hat{Z}_i\).
3. Simulate n.sim data-sets under the fitted geostatistical model.
4. For each of the simulated data-sets, obtain \(\hat{Z}_i\) as in Step 1, then compute the empirical variogram based on the resulting \(\hat{Z}_i\).
5. From the n.sim variograms obtained in the previous step, compute the 95% confidence interval for each distance bin.
If the observed variogram from Step 2 (obs.variogram below) falls within the 95% confidence intervals, we conclude that there is no evidence against the fitted spatial correlation model; if, instead, it partly falls outside the 95% confidence intervals, then the fitted model does not properly capture the spatial correlation in the data.
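The interval construction and the final comparison can be sketched as follows. The example assumes that the limits are pointwise empirical quantiles across simulations, which is a common choice but is not confirmed here; variogram_envelope is a hypothetical helper, and the simulated variograms are toy values rather than output from a fitted model.

```r
# Hypothetical helper: pointwise Monte Carlo limits for a binned variogram.
variogram_envelope <- function(obs, sim, level = 0.95) {
  # sim: matrix with one simulated variogram per row and one column per bin.
  a <- (1 - level) / 2
  lower <- unname(apply(sim, 2, quantile, probs = a, na.rm = TRUE))
  upper <- unname(apply(sim, 2, quantile, probs = 1 - a, na.rm = TRUE))
  list(
    lower.lim = lower,
    upper.lim = upper,
    outside = which(obs < lower | obs > upper)  # bins where obs leaves the band
  )
}

# Toy input: 200 simulated variograms over 14 bins, all fluctuating around 1.
set.seed(4)
n_sim <- 200
n_bin <- 14
sim <- matrix(rgamma(n_sim * n_bin, shape = 20, rate = 20), n_sim, n_bin)

obs <- rep(1, n_bin)
obs[3] <- 5                   # deliberately extreme value in bin 3
env <- variogram_envelope(obs, sim)
env$outside                   # index of the bin that falls outside the band
```

An empty outside vector corresponds to the case where the observed variogram lies entirely within the limits.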
Test for suitability of the adopted correlation function
This diagnostic test is performed if which.test="both"
or which.test="test statistic"
. Let \(v_{E}(B)\) and \(v_{T}(B)\) denote the empirical and theoretical variograms based on \(\hat{Z}_i\) for the distance bin \(B\).
The test statistic used for testing residual spatial correlation is
$$T = \sum_{B} N(B) \{v_{E}(B)-v_{T}(B)\}$$
where \(N(B)\) is the number of pairs of data-points falling within the distance bin \(B\) (n.bins
below).
To obtain the distribution of the test statistic \(T\) under the null hypothesis that the fitted model did generate the analysed data-set, we use the simulated empirical variograms obtained in Step 4 of the procedure described in "Variogram-based graphical validation". The p-value for the test of suitability of the fitted spatial correlation function is then computed as the proportion of simulated values of \(T\) that are larger than the value of \(T\) based on the original \(\hat{Z}_i\) from Step 1.
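A minimal sketch of this Monte Carlo p-value, using the statistic \(T\) exactly as written above. The helper mc_p_value and all inputs are hypothetical: the theoretical variogram, the bin counts and the simulated variograms are toy values chosen so that the observed variogram clearly departs from the theoretical one.

```r
# Hypothetical helper: Monte Carlo p-value for the variogram-based statistic.
mc_p_value <- function(v_obs, v_sim, v_theo, n_pairs) {
  # T = sum over bins of N(B) * {v_E(B) - v_T(B)}, as in the formula above.
  stat <- function(v) sum(n_pairs * (v - v_theo))
  t_obs <- stat(v_obs)
  t_sim <- apply(v_sim, 1, stat)   # one value of T per simulated variogram
  list(T.obs = t_obs, T.sim = t_sim, p.value = mean(t_sim > t_obs))
}

set.seed(5)
n_sim <- 200
n_bin <- 10
v_theo <- seq(0.2, 1, length.out = n_bin)   # toy theoretical variogram
n_pairs <- rep(50, n_bin)                   # toy number of pairs per bin

# Simulated variograms scattered around the theoretical one (rows = simulations).
v_sim <- matrix(
  rnorm(n_sim * n_bin, mean = rep(v_theo, each = n_sim), sd = 0.1),
  n_sim, n_bin
)

# Observed variogram shifted well above the theoretical one.
res <- mc_p_value(v_obs = v_theo + 0.5, v_sim, v_theo, n_pairs)
res$p.value
```

A small p-value indicates that the observed statistic is extreme relative to what the fitted model produces.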