MCgof implements and extends the Monte Carlo resampling method of Choo et al. (2024) to emulate Bayesian posterior predictive checks (Gelman et al. 1996, Royle et al. 2014). Initial results suggest the approach is more informative than the deviance-based test proposed by Borchers and Efford (2008) and implemented in secr.test. However, the tests have limited power.
MCgof is under development. The structure of the output may change and bugs may be found. See Warning below for exclusions.
# S3 method for secr
MCgof(object, nsim = 100, statfn = NULL, testfn = NULL, seed = NULL,
ncores = 1, clustertype = c("PSOCK", "FORK"), usefxi = TRUE,
useMVN = TRUE, Ndist = NULL, quiet = FALSE, ...)

# S3 method for secrlist
MCgof(object, nsim = 100, statfn = NULL, testfn = NULL, seed = NULL,
ncores = 1, clustertype = c("PSOCK", "FORK"), usefxi = TRUE,
useMVN = TRUE, Ndist = NULL, quiet = FALSE, ...)
Invisibly returns an object of class `MCgof' with components -
as input
as input or default
as input or default
list of outputs: for each statistic, a 3 x nsim matrix. Rows correspond to Tobs, Tsim, and a binary indicator for Tsim > Tobs
execution time in seconds
For secrlist input the value returned is a list of `MCgof' objects.
object | secr fitted model or secrlist object
nsim | integer number of replicates
statfn | function to extract summary statistics from capture histories
testfn | function to compare observed and expected counts
seed | integer seed
ncores | integer for number of parallel cores
clustertype | character cluster type for parallel::makeCluster
usefxi | logical; if FALSE then AC are simulated de novo from the density process rather than using information on the detected individuals
useMVN | logical; if FALSE parameter values are fixed at the MLE rather than drawn from multivariate normal distribution
Ndist | character; distribution of number of unobserved AC (optional)
quiet | logical; if FALSE then a progress bar (ncores=1) and final timing are shown
... | other arguments (not used)
Not all models are covered and some are untested. These models are specifically excluded -
multi-session models
models with groups
conditional likelihood
polygon, transect, telemetry or signal detectors
non-binary behavioural responses
This implementation extends the work of Choo et al. (2024) in these respects -
detector types `multi' and `count' are allowed
the model may include variation among detectors
the model may include behavioural responses
2-class finite mixture and hybrid mixture models are both allowed.
Murray Efford and Yan Ru Choo
At each replicate parameter values are sampled from the multivariate-normal sampling distribution of the fitted model. The putative location of each detected individual is drawn from the spatial distribution implied by its observations and the resampled parameters (see fxi); locations of undetected individuals are simulated from the complement of pdot(x) times D(x).
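The sampling of undetected individuals can be illustrated with a toy sketch in base R. It assumes a one-dimensional grid of candidate locations, a single detector, a half-normal detection function and a uniform density; all names and values (`g0`, `sigma`, `noccasions`, `n_undetected`) are hypothetical and this is not the code used inside MCgof.

```r
set.seed(1)

# Toy 1-D "mask" of candidate activity-centre locations (hypothetical values)
x <- seq(-500, 500, by = 10)
D <- rep(1, length(x))                 # uniform density surface D(x)

# Assumed detection model: one detector at 0, half-normal, 5 occasions
detector_x <- 0
g0 <- 0.2
sigma <- 50
noccasions <- 5
p1 <- g0 * exp(-(x - detector_x)^2 / (2 * sigma^2))  # per-occasion probability
pdot <- 1 - (1 - p1)^noccasions                      # detected at least once

# Weight for undetected individuals: complement of pdot(x) times D(x)
w <- (1 - pdot) * D
w <- w / sum(w)

# Draw locations for a hypothetical number of undetected individuals
n_undetected <- 20
undetected_x <- sample(x, n_undetected, replace = TRUE, prob = w)
```

Locations near the detector receive the lowest weight, because an animal centred there would probably have been detected.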
New detections are simulated under the model for individuals at the simulated locations, along with the expected numbers. Detections form a capthist object, a 3-D array with dimensions for individual \(i\), occasion \(j\) and detector \(k\)*. Thus for each replicate and detected individual there are the original observations \(y_{ijk}\), simulated observations \(Y_{ijk}\), and expected counts \(\mbox E (y_{ijk})\). Two discrepancy statistics are calculated for each replicate -- observed vs expected counts, and simulated vs expected counts -- and a record is kept of which of these discrepancy statistics is the larger (indicating poorer fit).
* Notation differs slightly from Choo et al. (2024), using \(j\) for occasion and \(k\) for detector to be consistent with usage in secr and elsewhere (e.g., Borchers and Fewster 2016).
The default discrepancy (testfn) is the Freeman-Tukey statistic as in Choo et al. (2024) and Royle et al. (2014) (see also Brooks, Catchpole and Morgan 2000). The statistic has this general form for \(M\) counts \(y_m\) with expected value \(\mbox E(y_m)\):
$$T = \sum_{m=1}^{M} \left(\sqrt{y_m} - \sqrt{\mbox E(y_m)}\right)^2.$$
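The formula translates directly into a few lines of base R. The function name `freeman_tukey` and its two-vector signature are assumptions made for illustration; the interface expected of a user-supplied testfn may differ.

```r
# Freeman-Tukey discrepancy between counts y and their expected values
# (illustrative helper, not the function used internally by MCgof)
freeman_tukey <- function(y, expected) {
  stopifnot(length(y) == length(expected), all(y >= 0), all(expected >= 0))
  sum((sqrt(y) - sqrt(expected))^2)
}

y <- c(0, 1, 4, 9)          # hypothetical observed counts
expected <- c(1, 1, 1, 4)   # hypothetical expected counts
freeman_tukey(y, expected)  # 1 + 0 + 1 + 1 = 3
```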
The key output of MCgof is the proportion of replicates in which the simulated discrepancy exceeds the observed discrepancy. For perfect fit this will be about 0.5, and for poor fit it will approach zero.
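The logic of the test can be mimicked with a toy Poisson model in base R. In this sketch the "resampled parameter" is a single Poisson mean drawn from a normal distribution, standing in for the multivariate-normal draw; the variable names and all numbers are hypothetical.

```r
set.seed(42)

ft <- function(y, expected) sum((sqrt(y) - sqrt(expected))^2)

nsim <- 200
y_obs <- rpois(30, lambda = 2)   # "observed" counts, generated under the model

Tobs <- numeric(nsim)
Tsim <- numeric(nsim)
for (r in seq_len(nsim)) {
  # stand-in for resampling parameters from their sampling distribution
  lambda_r <- max(rnorm(1, mean = 2, sd = 0.1), 0.01)
  expected <- rep(lambda_r, length(y_obs))
  y_sim <- rpois(length(y_obs), lambda = expected)
  Tobs[r] <- ft(y_obs, expected)   # observed vs expected
  Tsim[r] <- ft(y_sim, expected)   # simulated vs expected
}

# 3 x nsim matrix in the spirit of the output described under Value
out <- rbind(Tobs = Tobs, Tsim = Tsim, exceed = Tsim > Tobs)

# proportion of replicates with simulated discrepancy > observed discrepancy
p <- mean(Tsim > Tobs)
p
```

Because the observed counts here really do come from the assumed model, `p` should not be close to zero; generating `y_obs` with extra variation would push it downwards.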
By default, tests are performed separately for three types of count: the numbers of detections of each individual (yi), at each detector (yk), and for each individual at each detector (yik) extracted by the default statfn from the margins of the observed and simulated capture histories.
\(y_{ik} = \sum_j y_{ijk}\) | individual x detector
\(y_{i} = \sum_j \sum_k y_{ijk}\) | individual
\(y_{k} = \sum_i \sum_j y_{ijk}\) | detector
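The three margins can be computed from a 3-D array with `apply`. This is a sketch on a made-up array with dimensions individual x occasion x detector; the default statfn in secr may be implemented differently.

```r
set.seed(7)

# Toy capture histories: 4 individuals x 3 occasions x 5 detectors
# (hypothetical counts, not real data)
y <- array(rpois(4 * 3 * 5, lambda = 0.3), dim = c(4, 3, 5))

yik <- apply(y, c(1, 3), sum)   # individual x detector: sum over occasions j
yi  <- apply(y, 1, sum)         # individual: sum over j and k
yk  <- apply(y, 3, sum)         # detector: sum over i and j
```

The individual and detector totals are simply the row and column sums of `yik`.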
Parallel processing is offered using multiple cores (CPUs) through the package parallel when ncores > 1. This differs from the usual multithreading paradigm in secr and does not rely on the environment variable set by setNumThreads, except that, if ncores = NULL, ncores will be set to the value from setNumThreads. The cluster type "FORK" is available only on Unix-like systems; it can require large amounts of memory, but is generally fast. A small value of ncores > 1 may be optimal, especially with cluster type "PSOCK".
`usefxi' and `useMVN' may be used to drop key elements of the Choo et al. (2024) approach - they are provided for demonstration only.
`Ndist' refers to the distribution of the number of unobserved AC, conditional on the expected number \(q = D^*A - n\) where \(D^*\) is the resampled density, \(A\) the mask area, and \(n\) the number of detected individuals. By default `Ndist' depends on the distribution component of the `details' argument of the fitted model ("poisson" for Poisson \(n\), "fixed" for binomial \(n\)).
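A small worked example of the expected number \(q\), with made-up values. The two draws at the end are only one plausible reading of a Poisson and a fixed option; how MCgof actually generates the number of unobserved AC is not specified here, so treat both lines as assumptions.

```r
set.seed(3)

D_star <- 5.2   # resampled density, animals per hectare (hypothetical)
A <- 10         # mask area in hectares (hypothetical)
n <- 40         # number of detected individuals (hypothetical)

q <- D_star * A - n              # expected number of unobserved AC: 52 - 40 = 12

n_unobs_poisson <- rpois(1, q)   # ASSUMED: random number with mean q
n_unobs_fixed   <- round(q)      # ASSUMED: number fixed at its expectation
```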
The RNGkind of the random number generator is set internally for consistency across platforms.
Borchers, D. L. and Efford, M. G. (2008) Spatially explicit maximum likelihood methods for capture--recapture studies. Biometrics 64, 377--385.
Borchers, D. L. and Fewster, R. M. (2016) Spatial capture--recapture models. Statistical Science 31, 219--232.
Brooks, S. P., Catchpole, E. A. and Morgan, B. J. T. (2000) Bayesian animal survival estimation. Statistical Science 15, 357--376.
Choo, Y. R., Sutherland, C. and Johnston, A. (2024) A Monte Carlo resampling framework for implementing goodness-of-fit tests in spatial capture--recapture models. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.14386.
Gelman, A., Meng, X.-L., and Stern, H. (1996) Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica 6, 733--807.
Royle, J. A., Chandler, R. B., Sollmann, R. and Gardner, B. (2014) Spatial capture--recapture. Academic Press.
Parallel, secr.test, plot.MCgof, hist.MCgof, summary.MCgof
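# \donttest{
# run the tests with default settings (nsim = 100) on a fitted demo model
tmp <- MCgof(secrdemo.0)
summary(tmp)
# three square panels side by side
par(mfrow = c(1,3), pty = 's')
plot(tmp)
# }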