projectiveKMeans(
  datExpr,
  preferredSize = 5000,
  nCenters = as.integer(min(ncol(datExpr)/20, preferredSize^2/ncol(datExpr))),
  sizePenaltyPower = 4,
  networkType = "unsigned",
  randomSeed = 54321,
  checkData = TRUE,
  maxIterations = 1000,
  verbose = 0, indent = 0)
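The default for nCenters in the usage above is plain arithmetic on the number of genes and preferredSize. A small worked sketch, in which the gene counts are hypothetical values chosen only for illustration:

```r
# Hypothetical sizes; nGenes stands in for ncol(datExpr).
preferredSize <- 5000

# Same expression as the nCenters default in the usage section.
defaultCenters <- function(nGenes, preferredSize) {
  as.integer(min(nGenes / 20, preferredSize^2 / nGenes))
}

smallSet <- defaultCenters(20000, preferredSize)   # min(1000, 1250)
largeSet <- defaultCenters(100000, preferredSize)  # min(5000, 250)
print(c(smallSet, largeSet))
```

For moderate gene counts the first term (one center per 20 genes) is the smaller one. For very large gene counts the second term takes over and reduces the number of initial centers.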
networkType: one of "unsigned", "signed", "signed hybrid". See adjacency.
This function implements a variant of K-means clustering that is suitable for co-expression analysis.
Cluster centers are defined by the first principal component, and distances by correlation (more
precisely, 1-correlation). The distance between a gene and a cluster is multiplied by a factor of
$\max(clusterSize/preferredSize, 1)^{sizePenaltyPower}$, thus penalizing clusters whose size exceeds
preferredSize. The function starts with a randomly generated cluster assignment (hence the need to
set the random seed for repeatability) and executes iterations of calculating new centers and
reassigning each gene to the nearest center until the clustering becomes stable. Before returning, nearby
clusters are iteratively combined if their combined size is below preferredSize.
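The size penalty can be illustrated with a minimal sketch of a single reassignment step. This is not the package's internal code: the toy data, the two centers, and the cluster sizes are all hypothetical. It uses plain 1-correlation as the distance and ignores any effect of networkType.

```r
# Sketch only: one penalized reassignment step on toy data.
set.seed(1)
nSamples <- 30
nGenes <- 12
datExpr <- matrix(rnorm(nSamples * nGenes), nSamples, nGenes)  # samples x genes

centers <- datExpr[, c(1, 2)]   # hypothetical centers: profiles of genes 1 and 2
clusterSize <- c(8, 4)          # hypothetical current cluster sizes
preferredSize <- 5
sizePenaltyPower <- 4

# Distance of every gene to every center: 1 - correlation (genes x centers).
dst <- 1 - cor(datExpr, centers)

# Penalty factor max(clusterSize/preferredSize, 1)^sizePenaltyPower, per cluster.
penalty <- pmax(clusterSize / preferredSize, 1)^sizePenaltyPower

# Multiply each column (cluster) by its penalty, then pick the nearest center.
penalized <- sweep(dst, 2, penalty, `*`)
assignment <- apply(penalized, 1, which.min)
```

Only the first cluster exceeds preferredSize here, so only its distances are inflated. The second cluster's factor is 1 and its distances are unchanged.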
The standard principal component calculation via the function svd occasionally fails
(likely a convergence problem in the underlying LAPACK routines). Such errors are trapped and the
principal component is approximated by a weighted average of the expression profiles in the cluster. If
verbose is set above 2, an informational message is printed whenever this approximation is used.
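The error trapping described above can be sketched with tryCatch. The helper clusterCenter below is hypothetical and not part of the package. The weights it uses (absolute correlation with the mean profile) are an assumption, since the text does not say how the weighted average is formed. The svdFun argument exists only so that a failure can be simulated.

```r
# Hypothetical helper, not the package's implementation.
# x: samples x genes expression matrix for one cluster.
clusterCenter <- function(x, verbose = 0, svdFun = svd) {
  xs <- scale(x)  # center and scale each gene
  tryCatch(
    # First left singular vector = first principal component across samples.
    svdFun(xs, nu = 1, nv = 0)$u[, 1],
    error = function(e) {
      if (verbose > 2) message("svd failed; using a weighted average instead.")
      # Assumed weighting: absolute correlation of each gene with the mean profile.
      w <- abs(as.numeric(cor(xs, rowMeans(xs))))
      as.numeric(xs %*% w) / sum(w)
    }
  )
}

# Toy cluster: six genes sharing one signal plus noise.
set.seed(2)
signal <- rnorm(20)
x <- sapply(1:6, function(i) signal + 0.3 * rnorm(20))

pc <- clusterCenter(x)
approxCenter <- clusterCenter(x, svdFun = function(...) stop("simulated failure"))
```

In a tightly co-expressed cluster the two results should be close to proportional, up to sign, which is why a weighted average is a reasonable stand-in when svd does not converge.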