The function optimizes a set of partitions based on the value of a criterion function (see critFunC for details on the criterion function) for a given network and blockmodel for Generalized blockmodeling (Žiberna, 2007) based on other parameters (see below).
The optimization is done through local optimization, where the neighborhood of a partition includes all partitions that can be obtained by moving one unit from one cluster to another or by exchanging two units (from different clusters).
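The neighborhood described above can be sketched in base R. This is only an illustration: neighbors is a hypothetical helper, not a function of the package, and the rule that a move may not empty a cluster is an assumption made here.

```r
# Enumerate the neighborhood of a partition: all single-unit moves
# plus all exchanges of two units from different clusters.
# `neighbors` is a hypothetical helper, not part of the package.
neighbors <- function(clu, k) {
  out <- list()
  n <- length(clu)
  # Moves: reassign unit i to every other cluster
  for (i in seq_len(n)) {
    if (sum(clu == clu[i]) == 1) next  # assumption: clusters may not become empty
    for (g in setdiff(seq_len(k), clu[i])) {
      new <- clu
      new[i] <- g
      out[[length(out) + 1]] <- new
    }
  }
  # Exchanges: swap the cluster labels of two units from different clusters
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (clu[i] != clu[j]) {
        new <- clu
        new[c(i, j)] <- clu[c(j, i)]
        out[[length(out) + 1]] <- new
      }
    }
  }
  out
}

nb <- neighbors(c(1, 1, 2, 2), k = 2)
length(nb)  # 4 moves + 4 exchanges
```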
Either a list of partitions or the number of clusters and the number of partitions to generate can be specified (optParC).
optRandomParC(M, k, approaches, blocks, rep, save.initial.param =
TRUE, save.initial.param.opt = FALSE, deleteMs = TRUE,
max.iden = 10, switch.names = NULL, return.all =
FALSE, return.err = TRUE, seed = NULL, RandomSeed =
NULL, parGenFun = genRandomPar, mingr = NULL, maxgr =
NULL, addParam = list(genPajekPar = TRUE, probGenMech
= NULL), maxTriesToFindNewPar = rep * 10, skip.par =
NULL, useOptParMultiC = FALSE, useMulti =
useOptParMultiC, printRep = ifelse(rep
A matrix representing the (usually valued) network. For multi-relational networks, this should be an array with the third dimension representing the relation. The network can have one or more modes (different kinds of units with no ties among themselves). If the network is not two-mode, the matrix must be square.
The number of clusters used in the generation of partitions.
One of the approaches (for each relation in multi-relational networks in a vector) described in Žiberna (2007). Possible values are:
"bin" - binary blockmodeling,
"val" - valued blockmodeling,
"hom" - homogeneity blockmodeling,
"ss" - sum of squares homogeneity blockmodeling, and
"ad" - absolute deviations homogeneity blockmodeling.
The last two options are "shorthand" for specifying approaches="hom" and homFun to either "ss" or "ad".
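For intuition, the two homogeneity measures can be sketched for the cells of a single block. The sketch assumes that "ss" sums squared deviations from the block mean and "ad" sums absolute deviations from the block median; it is not the package's internal code, and the values are made up.

```r
# Cell values of one hypothetical block
block <- c(4, 5, 3, 6, 12)

# Sum of squares: squared deviations from the mean (assumed meaning of "ss")
ss <- sum((block - mean(block))^2)

# Absolute deviations: deviations from the median (assumed meaning of "ad")
ad <- sum(abs(block - median(block)))

c(ss = ss, ad = ad)
```

The single large value (12) contributes far more to ss than to ad, which is the usual reason for choosing one measure over the other.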
A vector, a list of vectors or an array with names of allowed block types.
Only listing of allowed block types (blockmodel is not pre-specified).
A vector with names of allowed block types. For multi-relational networks, it can be a list of such vectors. For approaches = "bin" or approaches = "val", at least two should be selected. Possible values are:
"nul" - null or empty block
"com" - complete block
"rdo", "cdo" - row and column-dominant blocks (binary and valued approach only)
"reg" - (f-)regular block
"rre", "cre" - row and column-(f-)regular blocks
"rfn", "cfn" - row and column-functional blocks (binary and valued approach only)
"den" - density block (binary approach only)
"avg" - average block (valued approach only)
"dnc" - do not care block - the error is always zero
The ordering is important, since if several block types have identical error, the first on the list is selected.
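The tie-breaking rule can be illustrated with a toy vector of errors. The error values are made up, and the use of which.min here only mimics the "first on the list" rule; it is not taken from the package.

```r
# Hypothetical errors of three allowed block types for one block,
# listed in the order in which they were specified
errs <- c(nul = 3, com = 1, reg = 1)

# which.min returns the first minimum, so "com" wins the tie with "reg"
chosen <- names(errs)[which.min(errs)]
chosen
```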
A pre-specified blockmodel.
An array with four dimensions (see example below). The third and the fourth represent the clusters (for rows and columns). The first is as long as the maximum number of allowed block types for a given block. If some block has fewer possible block types, the empty slots should have values NA. The second dimension is the number of relations (1 for single-relational networks). The values in the array should be the ones from above. The array can have only three dimensions in case of one-relational networks or if the same pre-specified blockmodel is assumed for all relations. Further, it can have only two dimensions, if in addition only one block type is allowed per block.
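A minimal sketch of such a four-dimensional array for one relation and two clusters, built with base R only. The particular choice of block types is made up for illustration.

```r
k <- 2  # number of clusters

# Dimensions: (max. allowed block types per block, relations, row cluster, column cluster)
blocks <- array(NA_character_, dim = c(2, 1, k, k))

blocks[1, 1, 1, 1]   <- "com"             # block (1,1): complete only
blocks[1:2, 1, 1, 2] <- c("com", "nul")   # block (1,2): complete or null
blocks[1, 1, 2, 1]   <- "nul"             # block (2,1): null only
blocks[1, 1, 2, 2]   <- "nul"             # block (2,2): null only

dim(blocks)
blocks[, 1, 1, 1]  # the unused second slot stays NA
```

An array like this is what the text above describes as a pre-specified blockmodel for the blocks argument.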
The number of repetitions/different starting partitions to check.
Should the initial parameters (approaches, ...) be saved. The default value is TRUE.
Should the initial parameters (approaches, ...) of the optParC calls be saved. The default value is FALSE.
Delete networks/matrices from the results to save space.
Maximum number of results that should be saved (in case there are more than max.iden results with minimal error, only the first max.iden will be saved).
Should partitions that only differ in group names be considered equal.
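What "differ only in group names" means can be sketched by relabeling clusters in order of first appearance. The helper canonical is hypothetical and is not how the package necessarily compares partitions.

```r
# Relabel clusters by order of first appearance (hypothetical helper)
canonical <- function(clu) match(clu, unique(clu))

a <- c(1, 1, 2, 2, 3)
b <- c(3, 3, 1, 1, 2)  # same grouping as a, different labels
d <- c(1, 2, 2, 2, 3)  # a genuinely different grouping

same_ab <- identical(canonical(a), canonical(b))
same_ad <- identical(canonical(a), canonical(d))
c(same_ab, same_ad)
```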
If FALSE, only the solutions for the best partition(s) (one or more) are returned.
Should the error for each optimized partition be returned.
Optional. The seed for random generation of partitions.
Optional. Integer vector, containing the random number generator (RNG) state. It is only looked for in the user's workspace.
The function (object) that will generate random partitions. The default function is genRandomPar. The function has to accept the following parameters: k (number of clusters by modes), n (number of units by modes), seed (seed value for random generation of partition), addParam (a list of additional parameters).
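A minimal sketch of a custom generator with the four parameters listed above, for a one-mode network. The name myRandomPar is hypothetical, the return format (one cluster label per unit) is an assumption, and the trailing ... is added only so that any further arguments a caller might pass are ignored.

```r
# Hypothetical partition generator; not the package's genRandomPar.
myRandomPar <- function(k, n, seed = NULL, addParam = list(), ...) {
  stopifnot(n >= k)
  if (!is.null(seed)) set.seed(seed)
  # One unit per cluster guarantees non-empty clusters;
  # the remaining n - k units are assigned at random
  clu <- c(seq_len(k), sample.int(k, n - k, replace = TRUE))
  # Shuffle so the guaranteed members are not always the first k units
  clu[sample.int(n)]
}

p <- myRandomPar(k = 3, n = 10, seed = 1)
table(p)
```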
Minimal allowed group size.
Maximal allowed group size.
A list of additional parameters for function specified above. In the usage section they are specified for the default function genRandomPar.
Should the partitions be generated as in Pajek.
Should the probabilities for different mechanisms for specifying the partitions be set. If probGenMech is not set, it is determined based on the parameter genPajekPar.
The maximum number of partitions to try when searching for a new partition to optimize that has not been checked before. The default value is rep * 10.
The partitions that are not allowed or were already checked and should therefore be skipped.
For backward compatibility. May be removed soon. See next argument.
Which version of local search should be used. The default is currently FALSE. If FALSE, first all possible moves in random order and then all possible exchanges in random order are tried. When a move with a lower value of the criterion function is found, the algorithm moves to this new partition. If TRUE, the version of local search is used in which all possible moves and exchanges are tried first and then the one with the lowest error is selected. In this case, several optimal partitions are found, and the maxPar best partitions are returned.
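The difference between the two search variants can be sketched on a toy problem. Everything here is an assumption made for illustration: crit is a stand-in criterion (within-cluster sum of squares of a numeric vector), not critFunC; only moves are considered, not exchanges; and a single partition is returned rather than several.

```r
# Toy criterion: within-cluster sum of squares (stand-in, not critFunC)
crit <- function(clu, x) sum(tapply(x, clu, function(v) sum((v - mean(v))^2)))

# All partitions reachable by moving one unit
# (assumption: a move may not empty a cluster)
moves <- function(clu, k) {
  out <- list()
  for (i in seq_along(clu)) {
    if (sum(clu == clu[i]) == 1) next
    for (g in setdiff(seq_len(k), clu[i])) {
      new <- clu
      new[i] <- g
      out[[length(out) + 1]] <- new
    }
  }
  out
}

localSearch <- function(clu, x, k, best = FALSE) {
  repeat {
    current <- crit(clu, x)
    cand <- moves(clu, k)
    improved <- FALSE
    if (best) {
      # Evaluate every candidate, then take the one with the lowest error
      errs <- vapply(cand, crit, numeric(1), x = x)
      if (length(errs) > 0 && min(errs) < current) {
        clu <- cand[[which.min(errs)]]
        improved <- TRUE
      }
    } else {
      # Try candidates in random order and accept the first improvement
      for (idx in sample(seq_along(cand))) {
        if (crit(cand[[idx]], x) < current) {
          clu <- cand[[idx]]
          improved <- TRUE
          break
        }
      }
    }
    if (!improved) return(clu)  # no better neighbor: local optimum
  }
}

x <- c(1, 1.1, 0.9, 5, 5.2, 4.8)
start <- c(1, 2, 1, 2, 1, 2)
set.seed(1)
first <- localSearch(start, x, k = 2, best = FALSE)
steepest <- localSearch(start, x, k = 2, best = TRUE)
```

On this well-separated toy data both variants end with the first three units in one cluster and the last three in the other; on harder problems they can stop at different local optima.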
Should some information about each optimization be printed.
The number of units by "modes". It is used only for generating random partitions. It has to be set only if there are more than two modes or if there are two modes but the matrix representing the network is one-mode (both modes are in rows and columns).
Number of cores to be used. Value 0 means all available cores. It can also be a cluster object.
Arguments passed to other functions, see critFunC.
The matrix of the network analyzed.
If return.all = TRUE - A list of results the same as best - one best for each partition optimized.
A list of results from crit.fun.tmp with the same elements as the result of crit.fun, only without M.
If return.err = TRUE - The vector of errors or inconsistencies of the empirical network with the ideal network for a given blockmodel (model, approach, ...) and partitions.
The vector of the number of iterations used - one value for each starting partition that was optimized. It can show that maxiter is too low if a lot of these values have the value of maxiter.
If selected - A list of checked partitions. If merge.save.skip.par is TRUE, this list also includes the partitions in skip.par.
The call used to call the function.
If selected - The initial parameters used.
The time complexity of package blockmodeling increases with the number of units and the number of clusters (due to its algorithm). Therefore, the analysis of networks with more than 100 units can take a lot of time (from a few hours to a few days).
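One way to see why the search space grows so quickly is to count the partitions of n units into exactly k non-empty clusters (Stirling numbers of the second kind). The helper stirling2 is written here for illustration; it describes the size of the search space, not the package's exact run time.

```r
# Number of partitions of n units into exactly k non-empty clusters:
# S(n, k) = (1 / k!) * sum_{j = 0}^{k} (-1)^j * choose(k, j) * (k - j)^n
stirling2 <- function(n, k) {
  j <- 0:k
  sum((-1)^j * choose(k, j) * (k - j)^n) / factorial(k)
}

stirling2(8, 2)    # 127 partitions for 8 units and 2 clusters
stirling2(100, 2)  # about 6.3e29 for 100 units and 2 clusters
```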
Batagelj, V., & Mrvar, A. (2006). Pajek 1.11. Retrieved from http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Doreian, P., Batagelj, V. & Ferligoj, A. (2005). Generalized blockmodeling, (Structural analysis in the social sciences, 25). Cambridge [etc.]: Cambridge University Press.
Žiberna, A. (2007). Generalized Blockmodeling of Valued Networks. Social Networks, 29(1), 105-126. doi: 10.1016/j.socnet.2006.04.002
Žiberna, A. (2008). Direct and indirect approaches to blockmodeling of valued networks in terms of regular equivalence. Journal of Mathematical Sociology, 32(1), 57-84. doi: 10.1080/00222500701790207
Žiberna, A. (2014). Blockmodeling of multilevel networks. Social Networks, 39(1), 46-61. doi: 10.1016/j.socnet.2014.04.002
# NOT RUN {
library(blockmodeling)
n <- 8 # If larger, the number of partitions increases dramatically,
# as it does if we increase the number of clusters
net <- matrix(NA, ncol = n, nrow = n)
clu <- rep(1:2, times = c(3, 5))
tclu <- table(clu)
net[clu == 1, clu == 1] <- rnorm(n = tclu[1] * tclu[1], mean = 0, sd = 1)
net[clu == 1, clu == 2] <- rnorm(n = tclu[1] * tclu[2], mean = 4, sd = 1)
net[clu == 2, clu == 1] <- rnorm(n = tclu[2] * tclu[1], mean = 0, sd = 1)
net[clu == 2, clu == 2] <- rnorm(n = tclu[2] * tclu[2], mean = 0, sd = 1)
# We select a random partition and then optimize it
all.par <- nkpartitions(n = n, k = length(tclu))
# Forming the partitions
all.par <- lapply(apply(all.par, 1, list), function(x)x[[1]])
# Optimizing one partition
res <- optParC(M = net,
               clu = all.par[[sample(1:length(all.par), size = 1)]],
               approaches = "hom", homFun = "ss", blocks = "com")
plot(res) # Hopefully we get the original partition
# Optimizing 10 randomly chosen partitions with optRandomParC
res <- optRandomParC(M = net, k = 2, rep = 10,
                     approaches = "hom", homFun = "ss", blocks = "com")
plot(res) # Hopefully we get the original partition
# }