This function selects an optimal mass value for Cluster Analysis via Random Partition Distributions, using the Ewens-Pitman Attraction distribution.
default.mass(
mass,
list.epam,
dis,
new.draws = TRUE,
w = c(1, 1, 1),
discount = 0,
temp = 10,
loss = "binder",
n.draws = 100L,
two.stage = TRUE,
parallel = TRUE
)

# S3 method for shallot.default.mass
print(x, ...)
mass: optional, a vector of mass values.
list.epam: optional, a list of expected pairwise allocation matrices. Each matrix in the list needs the attributes "mass" and "n.draws".
dis: a dissimilarity structure of class dist.
new.draws: logical; if TRUE, new draws are obtained at each mass value.
w: a vector of length 3 giving the weights to be used in mass.algorithm.
discount: discount parameter of the Ewens-Pitman Attraction distribution.
temp: temperature parameter of the Ewens-Pitman Attraction distribution.
loss: one of "binder" or "VI.lb", indicating whether the optimization should seek to minimize the expectation of the Binder loss (Binder 1978) or the lower bound of the expectation of the variation of information loss (Wade & Ghahramani 2017), respectively.
n.draws: number of draws of partitions to be obtained at each mass value.
two.stage: logical; if TRUE, the two-stage algorithm is implemented in mass.algorithm.
parallel: logical; if TRUE, computations will take advantage of multiple CPU cores.
x: an object returned by the default.mass function.
...: currently ignored.
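As a minimal sketch of how the arguments fit together, the following builds a dissimilarity object of class dist with base R and passes it to default.mass. The built-in iris data, the Euclidean distance, and the guard around the call are assumptions made for illustration; the call itself runs only if a package providing default.mass is already attached.

```r
# Build a dissimilarity structure of class "dist" from the numeric columns
# of the built-in iris data (an illustrative choice, not from this page).
x <- scale(iris[, 1:4])
dis <- dist(x, method = "euclidean")
class(dis)  # "dist"

# Call default.mass only if it is available in the current session.
# Argument names follow the usage shown above; mass and list.epam are
# documented as optional and are left out here.
if (exists("default.mass", mode = "function")) {
  fit <- default.mass(
    dis = dis,
    loss = "binder",
    n.draws = 100L,
    parallel = FALSE
  )
  print(fit)
}
```

Setting parallel = FALSE keeps the sketch reproducible on a single core; the documented default is TRUE.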
An object of class shallot.default.mass. This object is a list containing a matrix of 'best' possible mass values to maximize partition confidence and minimize the variance ratio, the clustering estimate, the expected pairwise allocation matrix, parameters used for optimization and the EPA distribution, and the list of expected pairwise allocation matrices for each mass value.
The function draws n.draws partitions at each specified mass value. If a vector of mass values is not given, the default seq(0.1,10,0.2) is used for loss "VI.lb" and seq(0.1,5,0.05) is used for the other loss functions.
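The default grids described above can be reproduced with base R. The helper name default_mass_grid is hypothetical and only mirrors the two documented seq() calls; it is not part of the package.

```r
# Hypothetical helper mirroring the documented default mass grids.
default_mass_grid <- function(loss = c("binder", "VI.lb")) {
  loss <- match.arg(loss)
  if (loss == "VI.lb") seq(0.1, 10, 0.2) else seq(0.1, 5, 0.05)
}

grid_binder <- default_mass_grid("binder")
grid_vi <- default_mass_grid("VI.lb")

# Inspect the span and size of each grid.
range(grid_binder)
length(grid_binder)
range(grid_vi)
length(grid_vi)
```

Note that the "VI.lb" grid stops at 9.9, because 10 is not reached from 0.1 in steps of 0.2.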
If a list of expected pairwise allocation matrices (EPAMs) is provided, additional draws at matching mass values are added to the corresponding matrix. When such a list is provided, no new draws are needed for estimation.
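To illustrate the idea of adding draws to an existing EPAM, here is a sketch in base R. It assumes that an EPAM holds, for each pair of items, the proportion of sampled partitions in which the two items share a cluster, and that pooling is a weighted average by the "n.draws" attribute. The function names are hypothetical and the package's internal bookkeeping may differ.

```r
# Estimate an EPAM from partition draws: one partition per row,
# one item per column, entries are cluster labels.
epam_from_draws <- function(draws) {
  n <- ncol(draws)
  total <- matrix(0, n, n)
  for (k in seq_len(nrow(draws))) {
    total <- total + outer(draws[k, ], draws[k, ], "==")
  }
  total / nrow(draws)
}

# Attach the two attributes the documentation requires on each matrix.
tag_epam <- function(epam, mass, n.draws) {
  attr(epam, "mass") <- mass
  attr(epam, "n.draws") <- n.draws
  epam
}

# Pool two EPAMs at the same mass value, weighting by number of draws
# (an assumed pooling rule, not taken from the package source).
pool_epam <- function(a, b) {
  stopifnot(isTRUE(all.equal(attr(a, "mass"), attr(b, "mass"))))
  na <- attr(a, "n.draws")
  nb <- attr(b, "n.draws")
  pooled <- (na * a + nb * b) / (na + nb)
  tag_epam(pooled, attr(a, "mass"), na + nb)
}

old_draws <- rbind(c(1, 1, 2), c(1, 2, 2))
new_draws <- rbind(c(1, 1, 1), c(1, 1, 2))
old <- tag_epam(epam_from_draws(old_draws), mass = 1, n.draws = 2L)
new <- tag_epam(epam_from_draws(new_draws), mass = 1, n.draws = 2L)
pooled <- pool_epam(old, new)
attr(pooled, "n.draws")  # 4
```

Under these assumptions, pooling gives the same matrix as computing the EPAM once from all four draws.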
A partition/clustering estimate from each EPAM is obtained using the SALSO method in salso. The estimate given minimizes the specified loss function with respect to the EPAM.
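To make "minimizes the loss with respect to the EPAM" concrete, the sketch below scores candidate partitions by their expected Binder loss under an EPAM and keeps the best one. This is a brute-force stand-in over a small candidate list, not the SALSO search; the function names, the toy EPAM, and the equal-cost form of the Binder loss are assumptions.

```r
# Expected Binder loss (equal costs) of a partition given an EPAM:
# for each pair i < j, the loss is 1 - p_ij if the pair is clustered
# together and p_ij otherwise, i.e. |1(c_i == c_j) - p_ij|.
expected_binder <- function(labels, epam) {
  same <- outer(labels, labels, "==")
  upper <- upper.tri(epam)
  sum(abs(same[upper] - epam[upper]))
}

# Pick the candidate (one partition per row) with the smallest expected loss.
best_candidate <- function(candidates, epam) {
  losses <- apply(candidates, 1, expected_binder, epam = epam)
  list(estimate = candidates[which.min(losses), ], loss = min(losses))
}

# Toy EPAM for three items: items 1 and 2 usually cluster together.
epam <- matrix(c(1.0, 0.9, 0.1,
                 0.9, 1.0, 0.2,
                 0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)
candidates <- rbind(c(1, 1, 1), c(1, 1, 2), c(1, 2, 3))
result <- best_candidate(candidates, epam)
result$estimate  # 1 1 2
```

A dedicated search such as SALSO explores the space of partitions rather than a fixed candidate list, which matters once the number of items grows.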
The function then uses mass.algorithm to select the optimal mass value for clustering estimation.
Other Default Mass Selection: mass.algorithm(), partition.confidence(), variance.ratio()