Perform a single simulation run for the LSBCLUST model. Multiple data sets are generated for a single set of underlying parameters.
sim_lsbclust(ndata, nobs, size, nclust, clustsize = NULL,
delta = rep(1L, 4L), ndim = 2L, alpha = 0.5, fixed = c("none",
"rows", "columns"), err_sd = 1, svmins = 0.5, svmax = 5,
seed = NULL, parallel = FALSE, parallel_data = TRUE, verbose = 0,
nstart_T3 = 20L, nstart_ak = 20L, mc.cores = detectCores() - 1,
include_fits = FALSE, include_data = FALSE, nstart, nstart.kmeans)
Integer giving the number of data sets to generate with the same underlying parameters.
Integer giving the number of observations to sample.
Vector with two elements giving the number of rows and columns respectively of each simulated observation.
A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order) respectively. Alternatively, a vector of length one, in which case all components will have the same number of clusters.
A list of length four, with each element containing a vector whose length equals the corresponding entry in nclust, indicating the number of elements to contribute to each sample. Each of these vectors must sum to nobs, or an error will result. Positional matching is used, in the order "overall", "rows", "columns" and "interactions". If NULL, all clusters will be of equal size.
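As a sketch of the constraint described above, the following base R snippet builds a clustsize list for a hypothetical run with nobs = 100 and checks it before it would be passed to sim_lsbclust. The cluster counts and sizes are made up for illustration; only the length and sum requirements come from the description.

```r
# Illustrative values only; not taken from the package documentation.
nobs   <- 100L
nclust <- c(5L, 4L, 3L, 2L)  # overall, rows, columns, interactions

# One vector per component, matched by position in the order
# "overall", "rows", "columns", "interactions".
clustsize <- list(
  c(20, 20, 20, 20, 20),  # overall: 5 clusters
  c(40, 30, 20, 10),      # rows: 4 clusters
  c(50, 30, 20),          # columns: 3 clusters
  c(60, 40)               # interactions: 2 clusters
)

# Each vector must have as many entries as the matching element of nclust ...
lengths_ok <- all(lengths(clustsize) == nclust)
# ... and must sum to nobs.
sums_ok <- all(vapply(clustsize, sum, numeric(1)) == nobs)

stopifnot(lengths_ok, sums_ok)
```

Running a check like this up front surfaces a mismatch before any data are simulated.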
A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced.
The required rank for the approximation of the interactions (a scalar).
Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to int.lsbclust).
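One common way to split singular values between row and column coordinates is the biplot convention sketched below, in which the rows receive the singular values raised to the power alpha and the columns receive them raised to the power 1 - alpha. Whether int.lsbclust uses exactly this parameterisation is an assumption here; the snippet only illustrates the general idea in base R on a random matrix.

```r
set.seed(1)
X <- matrix(rnorm(10 * 8), nrow = 10, ncol = 8)  # arbitrary 10 x 8 matrix
ndim  <- 2L
alpha <- 0.5

s <- svd(X, nu = ndim, nv = ndim)
d <- s$d[seq_len(ndim)]

# Assumed convention: rows get d^alpha, columns get d^(1 - alpha).
row_coords <- s$u %*% diag(d^alpha, nrow = ndim)
col_coords <- s$v %*% diag(d^(1 - alpha), nrow = ndim)

# The rank-ndim approximation is the same whatever value alpha takes.
approx_alpha <- row_coords %*% t(col_coords)
approx_svd   <- s$u %*% diag(d, nrow = ndim) %*% t(s$v)
all.equal(approx_alpha, approx_svd)
```

Under this convention alpha changes only how the scale is shared between the two sets of coordinates, not the fitted interactions themselves.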
One of "none", "rows" or "columns", indicating whether to fix neither set of coordinates, or to fix the row or column coordinates across clusters respectively. If a vector is supplied, only the first element will be used (passed to int.lsbclust).
The standard deviation of the error distribution, as passed to rnorm.
Vector of minimum values for the singular values (as passed to simsv). Optionally, if all minima are equal, a single numeric value which will be expanded to the correct length.
The maximum possible singular value (as passed to simsv).
An optional seed to be set for the random number generator
Logical indicating whether to parallelize over random starts. Note that parallel_data takes precedence over this.
Logical indicating whether to parallelize over the data sets. If FALSE, parallelization is done over random starts (depending on parallel).
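The precedence between the two flags can be summarised in a small helper. The function parallel_mode() below is hypothetical and not part of lsbclust; it only restates, under one reading of the descriptions above, the rule that parallel_data wins over parallel.

```r
# Hypothetical helper, not part of the package.
parallel_mode <- function(parallel = FALSE, parallel_data = TRUE) {
  if (isTRUE(parallel_data)) {
    "data sets"        # parallelize over the simulated data sets
  } else if (isTRUE(parallel)) {
    "random starts"    # parallelize over random starts within each fit
  } else {
    "none"             # run everything sequentially
  }
}

parallel_mode(parallel = TRUE,  parallel_data = TRUE)
parallel_mode(parallel = TRUE,  parallel_data = FALSE)
parallel_mode(parallel = FALSE, parallel_data = FALSE)
```

With the defaults (parallel = FALSE, parallel_data = TRUE), this reading implies that the work is split across data sets.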
Integer giving the number of iterations after which the loss value is printed.
The number of random starts to use for T3Clusf
The number of random starts to use for akmeans
The number of cores to use, passed to makeCluster
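The default of detectCores() - 1 can be fragile: parallel::detectCores() may return NA on some platforms, and on a single-core machine the default evaluates to 0. A defensive way to choose a value to pass as mc.cores, offered here as a suggestion rather than something the package does, is:

```r
library(parallel)

n_detected <- detectCores()
# Fall back to one core if detection fails or only one core is available.
mc_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)
mc_cores
```

The resulting value could then be supplied explicitly, for example as mc.cores = mc_cores.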
Logical indicating whether to include the model fits, or only the fit statistics
Logical indicating whether to include the simulated data fitted on, or only the results
From lsbclust
From lsbclust
# NOT RUN {
set.seed(1)
res <- sim_lsbclust(ndata = 5, nobs = 100, size = c(10, 8), nclust = rep(5, 4),
                    verbose = 0, nstart_T3 = 2, nstart_ak = 1, parallel_data = FALSE,
                    nstart = 2, nstart.kmeans = 5)
# }