- counts
A matrix of allelic count data, for which nrow = the number of populations and
ncol = the number of bi-allelic loci sampled. Each cell gives the number of times
allele '1' is observed at that locus in that population. The choice of which allele is
allele '1' is arbitrary, but must be consistent across all populations at a locus.
- sample_sizes
A matrix of sample sizes, for which nrow = the number of populations and
ncol = the number of bi-allelic loci sampled (i.e., the dimensions of sample_sizes
must match those of counts). Each cell gives the number of chromosomes successfully
genotyped at each locus in each population.
- D
Pairwise geographic distance (\(D_{i,j}\)). This can be two-dimensional Euclidean
distance, or great-circle distance, or, in fact, any positive definite matrix (deriving,
for instance, from a resistance distance). However, note that the algorithm silently
restricts the prior on the alpha parameters, and specifically the alpha_2 parameter, to
the part of parameter space that results in valid covariance matrices; in the case of
two-dimensional Euclidean distances, this will not happen, since any value of alpha_2
between 0 and 2 is valid (see Guillot et al.'s "Valid covariance models for the analysis
of geographical genetic variation" for more detail on this).
- E
Pairwise ecological distance(s) (\(E_{i,j}\)), which may be continuous (e.g.,
difference in elevation) or binary (same or opposite side of some hypothesized
barrier to gene flow). Users may specify one or more ecological distance matrices.
If more than one is specified, they should be formatted as a list.
- k
The number of populations in the analysis. This should be equal to nrow(counts).
- loci
The number of loci in the analysis. This should be equal to ncol(counts).
- delta
The size of the "delta shift" on the off-diagonal elements of the parametric
covariance matrix, used to ensure its positive-definiteness (even, for example,
when there are separate populations sampled at the same geographic/ecological
coordinates). This value must be large enough that the covariance matrix is
positive-definite, but, if possible, should be smaller than the smallest
off-diagonal distance elements, lest it have an undue impact on inference. If the
user is concerned that the delta shift is too large relative to the pairwise
distance elements in D and E, they should run subsequent analyses, varying the
size of delta, to see whether it affects model inference.
- aD_stp
The scale of the tuning parameter on aD (alphaD). The scale of the tuning
parameter is the standard deviation of the normal distribution from which small
perturbations are made to those parameters updated via a random-walk sampler.
A larger value of the scale of the tuning parameter will lead to, on average,
larger proposed moves and lower acceptance rates (for more on acceptance rates,
see plot_acceptance_rate).
- aE_stp
The scale of the tuning parameter on aE (alphaE). If there are multiple
ecological distances included in the analysis, there will be multiple alphaE
parameters (one for each matrix in the list of E). These may all be updated
with the same tuning-parameter scale, or each may get its own, in which case
aE_stp should be a vector of length equal to the number of ecological
distance variables.
- a2_stp
The scale of the tuning parameter on a2 (alpha_2).
- thetas_stp
The scale of the tuning parameter on the theta parameters.
- mu_stp
The scale of the tuning parameter on mu.
- ngen
The number of generations over which to run the MCMC (one parameter is updated
at random per generation, with mu, theta, and phi all counting, for the purposes of
updates, as one parameter).
- printfreq
The frequency with which MCMC progress is printed to the screen. If
printfreq = 1000, an update with the MCMC generation number and the posterior
probability at that generation will print to the screen every 1000 generations.
- savefreq
The frequency with which the MCMC saves its output as an R object (savefreq = 50000
means that MCMC output is saved every 50,000 generations). If ngen is large,
this saving process may be computationally expensive, and so should not be performed
too frequently. However, users may wish to evaluate MCMC performance while the chain
is still running, or may be forced to truncate runs early, and should therefore
specify a savefreq that is less than ngen. We recommend a savefreq of between
1/20th and 1/10th of ngen.
- samplefreq
The thinning of the MCMC chain (samplefreq = 1000 means that the parameter
values saved in the MCMC output are sampled once every 1000 generations). A higher
samplefreq will decrease the autocorrelation among the saved parameter values.
However, there is still information in autocorrelated draws from the joint
posterior, so the samplefreq should be viewed merely as a computational
convenience, to decrease the size of the MCMC output objects.
- directory
If specified, this points to a directory into which output will be saved.
- prefix
If specified, this prefix will be added to all output file names.
- continue
If TRUE, this will initiate the MCMC chain from the last parameter values of a
previous analysis. This option can be used to effectively increase the ngen
of an initial run. If FALSE, the MCMC will be initiated from random parameter
values.
- continuing.params
The list of parameter values used to initiate the MCMC if continue = TRUE. If
the user wants to continue an analysis on a dataset, these should be the parameter
values from the last generation of the previous analysis. This list may be generated
using the function make.continuing.params.
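
The shape constraints described in the arguments above can be checked before a long run is started. The following is a minimal sketch in base R on made-up toy data; it does not call any function from this package, and the helper names (coords, elevation, side, input_checks) as well as all numeric values are assumptions chosen purely for illustration.

```r
# Toy inputs: 4 populations (rows) by 6 bi-allelic loci (columns).
set.seed(1)
k <- 4
loci <- 6

# Chromosomes successfully genotyped per population and locus.
sample_sizes <- matrix(sample(10:20, k * loci, replace = TRUE),
                       nrow = k, ncol = loci)

# Copies of the (arbitrarily chosen) allele '1'; never more than the sample size.
counts <- matrix(rbinom(k * loci, size = as.vector(sample_sizes), prob = 0.3),
                 nrow = k, ncol = loci)

# Two-dimensional Euclidean geographic distance from made-up coordinates.
coords <- cbind(x = c(0, 1, 2, 3), y = c(0, 2, 1, 3))
D <- as.matrix(dist(coords))

# Two ecological distances, supplied as a list: one continuous, one binary.
elevation <- c(100, 250, 400, 900)  # hypothetical elevations
side <- c(0, 0, 1, 1)               # hypothetical side of a barrier
E <- list(
  elevation = as.matrix(dist(elevation)),
  barrier = as.matrix(dist(side))
)

# One tuning-parameter scale per ecological distance matrix (illustrative values).
aE_stp <- c(0.01, 0.05)

# Run-length settings, with savefreq at 1/10th of ngen.
ngen <- 1e6
savefreq <- ngen / 10

# Each entry is TRUE when the corresponding constraint from the text holds.
input_checks <- c(
  dims_match = identical(dim(counts), dim(sample_sizes)),
  k_ok = k == nrow(counts),
  loci_ok = loci == ncol(counts),
  counts_ok = all(counts >= 0 & counts <= sample_sizes),
  D_ok = all(dim(D) == k),
  E_ok = all(vapply(E, function(m) all(dim(m) == k), logical(1))),
  aE_stp_ok = length(aE_stp) %in% c(1L, length(E)),
  savefreq_ok = savefreq >= ngen / 20 && savefreq <= ngen / 10
)
print(input_checks)
```

Any FALSE entry in input_checks points at an argument whose dimensions or value disagree with the descriptions above. The checks cover shapes and ranges only; they say nothing about whether a given delta or tuning-parameter scale is appropriate for a real dataset.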