redist.smc uses a Sequential Monte Carlo algorithm to
generate nearly independent congressional or legislative redistricting
plans according to contiguity, population, compactness, and administrative
boundary constraints.
redist.smc(
  adjobj,
  popvec,
  nsims,
  ndists,
  counties = NULL,
  popcons = 0.01,
  compactness = 1,
  resample = TRUE,
  constraint_fn = function(m) rep(0, ncol(m)),
  adapt_k_thresh = 0.95,
  seq_alpha = 0.1 + 0.2 * compactness,
  truncate = (compactness != 1),
  trunc_fn = function(x) pmin(x, 0.01 * nsims^0.4),
  max_oversample = 20,
  verbose = TRUE,
  silent = FALSE
)
An adjacency matrix, list, or object of class "SpatialPolygonsDataFrame".
A vector containing the populations of each geographic unit.
The number of samples to draw.
The number of districts in each redistricting plan.
A vector containing county (or other administrative or
geographic unit) labels for each unit, which must be integers ranging from 1
to the number of counties. If provided, the algorithm will only generate
maps which split up to ndists-1 counties. If no county-split
constraint is desired, this parameter should be left as NULL (the default).
The desired population constraint. All sampled districts
will deviate from the target district size by no more than this value,
expressed as a proportion; e.g., popcons=0.01 will ensure districts have
populations within 1% of the target population.
Controls the compactness of the generated districts, with higher values preferring more compact districts. Must be nonnegative. See the 'Details' section for more information and computational considerations.
Whether to perform a final resampling step so that the
generated plans can be used immediately. Set this to FALSE to perform
direct importance sampling estimates, or to adjust the weights manually.
A function which takes in a matrix where each column is a redistricting plan and outputs a vector of log-weights, which will be added to the final weights.
The threshold value used in the heuristic to select a
value k_i for each splitting iteration. Set to 0.9999 or 1 if
the algorithm does not appear to be sampling from the target distribution.
Must be between 0 and 1.
The amount to adjust the weights by at each resampling step; higher values prefer exploitation, while lower values prefer exploration. Must be between 0 and 1.
Whether to truncate the importance sampling weights at the
final step by trunc_fn. Recommended if compactness is not 1.
A function which takes in a vector of weights and returns a truncated vector. Recommended to specify this manually if truncating weights.
How much oversampling to allow at each stage; used to control memory and computation time. If the algorithm is not producing the desired number of samples, this should be increased.
Whether to print out intermediate information while sampling. Recommended.
Whether to suppress all diagnostic information.
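
As a rough illustration of the label and function arguments described above, the following base R sketch builds integer county labels, a custom constraint function, and a custom truncation function. The names county_names, split_penalty, and trunc_q99 are hypothetical, the penalty is an arbitrary choice made only to show the expected input and output shapes, and the assumption that plans store integer district labels is ours.

```r
# Hypothetical county names for six geographic units.
county_names <- c("Adams", "Baker", "Adams", "Clark", "Baker", "Clark")

# Integer labels running from 1 to the number of counties.
counties <- as.integer(factor(county_names))

# A constraint function receives a matrix with one plan per column and
# returns one log-weight per column. This toy version (an assumption, not
# part of the package) adds a log-weight of -1 to any plan that places
# units 1 and 2 in different districts.
split_penalty <- function(m) {
  ifelse(m[1, ] != m[2, ], -1, 0)
}

# A truncation function receives a vector of weights and returns a
# truncated vector. This one caps weights at their 99th percentile.
trunc_q99 <- function(x) pmin(x, quantile(x, 0.99, names = FALSE))

# Two made-up plans for six units and three districts, one per column.
plans <- cbind(c(1, 1, 2, 2, 3, 3), c(1, 2, 2, 3, 3, 1))
log_weights <- split_penalty(plans)
```

Objects like these could then be supplied as counties = counties, constraint_fn = split_penalty, and trunc_fn = trunc_q99 (together with truncate = TRUE).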
redist.smc returns an object of class redist, which
is a list containing the following components:
The adjacency list used to sample.
The matrix of sampled plans. Each row is a geographical unit, and each column is a sample.
The importance sampling weights, normalized to sum to 1.
The number of plans sampled.
The population constraint.
The compactness constraint.
The maximum population deviation of each sample.
The provided vector of unit populations.
The provided county vector.
The provided control parameter.
The provided control vector.
The provided control vector.
The algorithm used, here "smc".
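
When resample = FALSE, the normalized weights can be used directly for importance sampling estimates. The following is a minimal sketch with made-up numbers; the names w and stat are hypothetical and are not component names of the returned object.

```r
# Hypothetical normalized importance weights for five sampled plans.
w <- c(0.10, 0.40, 0.20, 0.25, 0.05)

# Hypothetical statistic computed on each plan (e.g., a vote share).
stat <- c(0.48, 0.52, 0.50, 0.55, 0.47)

# Importance sampling estimate of the mean statistic; because the
# weights already sum to 1, this is a plain weighted sum.
is_estimate <- sum(w * stat)

# Effective sample size: equals length(w) for uniform weights and
# shrinks as the weights become more uneven.
ess <- 1 / sum(w^2)
```

An effective sample size far below the number of plans suggests that a few plans dominate the estimate.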
This function draws nearly-independent samples from a specific target measure,
controlled by the popcons, compactness, and constraint_fn parameters.
Higher values of compactness sample more compact districts;
setting this parameter to 1 is computationally efficient and generates nicely
compact districts. Values other than 1 may lead to highly variable
importance sampling weights. By default these weights are truncated at
nsims^0.4 / 100 (the default trunc_fn) to stabilize the resulting estimates,
but if truncation is used, a specific truncation function should probably be
chosen by the user.
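
To get a feel for the default truncation, the sketch below evaluates the default trunc_fn from the usage signature, pmin(x, 0.01 * nsims^0.4), for a few values of nsims. The wrapper default_trunc and the weight vector are made up for illustration.

```r
# Default truncation as written in the usage signature, wrapped so that
# nsims is an explicit argument (the wrapper itself is our own).
default_trunc <- function(x, nsims) pmin(x, 0.01 * nsims^0.4)

# Truncation threshold for several sample sizes.
nsims_grid <- c(100, 1000, 10000)
thresholds <- 0.01 * nsims_grid^0.4

# Made-up weights; only values above the threshold are capped.
w <- c(0.001, 0.02, 0.5)
truncated <- default_trunc(w, nsims = 10000)
```

The threshold grows slowly with nsims, so larger runs tolerate somewhat larger individual weights.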
Because of the randomness inherent in the algorithm and the way it samples,
this function is not guaranteed to produce exactly nsims samples.
Failure to do so is usually a result of a hard-to-meet population constraint,
especially when there are many districts. Increasing max_oversample
should generally alleviate this problem.
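
How tight a population constraint is can be gauged before sampling by comparing the allowed district population range with the unit populations. This is a minimal base R sketch, assuming the target population is the total population divided by ndists; the unit populations are made up.

```r
# Made-up unit populations.
popvec <- c(120, 95, 130, 110, 105, 90, 140, 110)
ndists <- 3
popcons <- 0.01

# Target district population (assumed: total population / ndists).
target <- sum(popvec) / ndists

# Allowed range for each district's population under popcons.
bounds <- target * c(1 - popcons, 1 + popcons)

# Width of the allowed range. When it is narrower than the smallest
# unit, few combinations of units can land inside it.
range_width <- diff(bounds)
is_tight <- range_width < min(popvec)
```

In a case like this one, a looser popcons or a larger max_oversample may be needed to obtain the requested number of samples.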
# NOT RUN {
data(algdat.pfull)
sampled_plans = redist.smc(algdat.pfull$adjlist, algdat.pfull$precinct.data$pop,
                           nsims=10000, ndists=3, popcons=0.1)
# }