x.validation
Runs a conStruct cross-validation analysis.
x.validation(
train.prop = 0.9,
n.reps,
K,
freqs = NULL,
data.partitions = NULL,
geoDist,
coords,
prefix,
n.iter,
make.figs = FALSE,
save.files = FALSE,
parallel = FALSE,
n.nodes = NULL,
...
)
This function returns (and also saves as a .Robj) a list
containing the standardized results of the cross-validation analysis across replicates. For each replicate, the function returns a list with the following elements:
sp - the mean of the standardized log likelihoods of the "testing" data partition of that replicate for the spatial model, for each value of K specified in K.
nsp - the mean of the standardized log likelihoods of the "testing" data partition of that replicate for the nonspatial model, for each value of K specified in K.
In addition, this function saves two text files containing the standardized
cross-validation results for the spatial and nonspatial results
(prefix_sp_xval_results.txt and prefix_nsp_xval_results.txt, respectively).
These values are written as matrices for user convenience; each column is
a cross-validation replicate, and each row gives the result for a value of K.
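As a rough illustration of working with these matrices, the sketch below writes a made-up results file to a temporary location, reads it back, and averages over replicates for each value of K. The numbers, the K vector, the column names, and the presence of a header row are all assumptions made for the example; check the actual files written under your prefix before relying on the read.table settings.

```r
# Toy stand-in for prefix_sp_xval_results.txt (values are invented):
# 3 values of K (rows) x 4 cross-validation replicates (columns).
K <- 1:3
toy <- matrix(c(-5.0, -4.0, -6.0, -5.0,
                -1.0, -2.0, -1.5, -1.5,
                 0.0,  0.0, -0.5, -0.5),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, paste0("rep_", 1:4)))

# Write the toy matrix to a temporary file (assumed layout: header row present).
path <- tempfile(fileext = ".txt")
write.table(toy, file = path, row.names = FALSE, quote = FALSE)

# Read it back as a numeric matrix: rows are values of K, columns are replicates.
sp.results <- as.matrix(read.table(path, header = TRUE, stringsAsFactors = FALSE))

# Mean and standard error across replicates for each value of K.
sp.means <- rowMeans(sp.results)
sp.se <- apply(sp.results, 1, sd) / sqrt(ncol(sp.results))

# Value of K with the highest mean standardized predictive accuracy.
best.K <- K[which.max(sp.means)]
```

The same steps apply to the nonspatial file. Comparing the two sets of means, with their standard errors, is one simple way to look at support for the spatial versus the nonspatial model at each K.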
train.prop - A numeric value between 0 and 1 giving the proportion of the data to be used in the training partition of the analysis. Default is 0.9.
n.reps - An integer giving the number of cross-validation replicates to be run.
K - A numeric vector giving the numbers of layers to be tested in each cross-validation replicate, e.g., K=1:7.
freqs - A matrix of allele frequencies with one column per locus and one row per sample. Missing data should be indicated with NA.
data.partitions - A list with one element for each desired cross-validation replicate. This argument can be specified instead of the freqs argument if the user wants to provide their own data partitions for model training and testing. See the model comparison vignette for details on what this should look like.
geoDist - A matrix of geographic distance between samples. If NULL, the user can only run the nonspatial model.
coords - A matrix giving the longitude and latitude (or X and Y coordinates) of the samples.
prefix - A character vector giving the prefix to be attached to all output files.
n.iter - An integer giving the number of iterations each MCMC chain is run. Default is 1e3. If the number of iterations is greater than 500, the MCMC is thinned so that the number of retained iterations is 500 (before burn-in).
make.figs - A logical value indicating whether to automatically make figures during the course of the cross-validation analysis. Default is FALSE.
save.files - A logical value indicating whether to automatically save output and intermediate files once the analysis is complete. Default is FALSE.
parallel - A logical value indicating whether or not to run the different cross-validation replicates in parallel. Default is FALSE. For more details on how to set up runs in parallel, see the model comparison vignette.
n.nodes - Number of nodes to run parallel analyses on. Default is NULL. Ignored if parallel is FALSE. For more details on how to set up runs in parallel, see the model comparison vignette.
... - Further options to be passed to rstan::sampling (e.g., adapt_delta).
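With the arguments above, a call might look like the following sketch. It assumes the example dataset bundled with the package (conStruct.data, with elements allele.frequencies, geoDist and coords); the wrapper name run.xval.example and the specific values for n.reps, K and n.iter are hypothetical choices for illustration. The call is wrapped in a function so that nothing is fitted until it is invoked explicitly, since a full analysis can take a long time.

```r
# Hypothetical wrapper: defining it is cheap, calling it starts the analysis.
run.xval.example <- function(prefix = "example") {
  if (!requireNamespace("conStruct", quietly = TRUE)) {
    stop("the conStruct package is required for this example")
  }

  # Assumed: the example dataset shipped with conStruct,
  # loaded into a local environment rather than the workspace.
  env <- new.env()
  utils::data("conStruct.data", package = "conStruct", envir = env)
  d <- env$conStruct.data

  conStruct::x.validation(
    train.prop = 0.9,              # 90% of the data used for training
    n.reps = 3,                    # number of cross-validation replicates
    K = 1:3,                       # numbers of layers to compare
    freqs = d$allele.frequencies,  # samples in rows, loci in columns
    data.partitions = NULL,        # let the function build the partitions
    geoDist = d$geoDist,           # pairwise geographic distances
    coords = d$coords,             # sampling coordinates
    prefix = prefix,               # prefix attached to output files
    n.iter = 1e3,                  # MCMC iterations per chain
    make.figs = FALSE,
    save.files = FALSE,
    parallel = FALSE,
    n.nodes = NULL
  )
}
```

Calling run.xval.example() returns the list described above, with one element per replicate.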
This function initiates a cross-validation analysis that uses Monte Carlo cross-validation to determine the statistical support for models with different numbers of layers or with and without a spatial component.
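Monte Carlo cross-validation repeatedly splits the data at random into a training and a testing partition, fits the model to the former, and scores it on the latter. The base R sketch below shows only the random splitting step, under the assumption that the split is made over loci (columns of freqs). It is not the package's internal code, and the toy list it builds is not in the format expected by data.partitions; the toy data and object names are invented for the example.

```r
set.seed(1)  # make the toy example reproducible

# Hypothetical toy allele frequencies: 5 samples (rows) x 20 loci (columns).
freqs <- matrix(runif(5 * 20), nrow = 5, ncol = 20)

train.prop <- 0.9  # proportion of loci assigned to training
n.reps <- 3        # number of independent random splits

# For each replicate, draw a fresh random subset of loci for training;
# the loci left over form the testing partition.
partitions <- lapply(seq_len(n.reps), function(i) {
  n.loci <- ncol(freqs)
  train.idx <- sort(sample(n.loci, size = round(train.prop * n.loci)))
  test.idx <- setdiff(seq_len(n.loci), train.idx)
  list(training = freqs[, train.idx, drop = FALSE],
       testing = freqs[, test.idx, drop = FALSE])
})

# Each replicate keeps all samples and divides the loci 18 / 2.
sapply(partitions, function(p) c(train = ncol(p$training), test = ncol(p$testing)))
```

Because each replicate draws its split independently, the same locus can land in the testing partition of more than one replicate, which is what distinguishes this scheme from k-fold cross-validation.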