Usage
blockwiseConsensusModules(
multiExpr,
# Data checking options
checkMissingData = TRUE,
# Blocking options
blocks = NULL,
maxBlockSize = 5000,
randomSeed = 12345,
# TOM precalculation arguments, if available
individualTOMInfo = NULL,
useIndivTOMSubset = NULL,
# Network construction arguments: correlation options
corType = "pearson",
maxPOutliers = 1,
quickCor = 0,
pearsonFallback = "individual",
cosineCorrelation = FALSE,
# Adjacency function options
power = 6,
networkType = "unsigned",
checkPower = TRUE,
# Topological overlap options
TOMType = "unsigned",
TOMDenom = "min",
# Save individual TOMs?
saveIndividualTOMs = TRUE,
individualTOMFileNames = "individualTOM-Set%s-Block%b.RData",
# Consensus calculation options
consensusQuantile = 0,
scaleTOMs = TRUE, scaleQuantile = 0.95,
# Sampling for scaling (speeds up calculation)
sampleForScaling = TRUE, sampleForScalingFactor = 1000,
getTOMScalingSamples = FALSE,
# Returning the consensus TOM
saveTOMs = FALSE,
consensusTOMFileNames = "consensusTOM-block.%b.RData",
# Internal handling of TOMs
useDiskCache = TRUE, chunkSize = NULL,
cacheBase = ".blockConsModsCache",
# Basic tree cut options
deepSplit = 2,
detectCutHeight = 0.995, minModuleSize = 20,
checkMinModuleSize = TRUE,
# Advanced tree cut options
maxCoreScatter = NULL, minGap = NULL,
maxAbsCoreScatter = NULL, minAbsGap = NULL,
pamStage = TRUE, pamRespectsDendro = TRUE,
# Gene reassignment and trimming from a module, and module "significance" criteria
reassignThresholdPS = 1e-4,
trimmingConsensusQuantile = consensusQuantile,
minCoreKME = 0.5, minCoreKMESize = minModuleSize/3,
minKMEtoStay = 0.2,
# Module eigengene calculation options
impute = TRUE,
trapErrors = FALSE,
# Module merging options
mergeCutHeight = 0.15,
mergeConsensusQuantile = consensusQuantile,
# Output options
numericLabels = FALSE,
# General options
nThreads = 0,
verbose = 2, indent = 0)
Arguments
multiExpr
expression data in the multi-set format (see checkSets). A vector of
lists, one per set. Each set must contain a component data
that contains the expression data, with
rows corresponding to samples and columns to genes or probes.
checkMissingData
logical: should data be checked for excessive numbers of missing entries in
genes and samples, and for genes with zero variance? See details.
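The multi-set format expected by multiExpr can be sketched in base R as follows. This is only an illustration: the set names, sizes, and random values are hypothetical, and the sketch assumes samples in rows and genes in columns, with the same genes in every set.

```r
# Hypothetical toy data: 2 sets measured on the same 5 genes.
set.seed(1)
nGenes <- 5
genes <- paste0("gene", seq_len(nGenes))

# Build one set: a list with a `data` component (samples in rows, genes in columns).
makeSet <- function(nSamples) {
  m <- matrix(rnorm(nSamples * nGenes), nrow = nSamples, ncol = nGenes)
  colnames(m) <- genes
  list(data = as.data.frame(m))
}

# The multi-set object: one list per set.
multiExpr <- list(setA = makeSet(8), setB = makeSet(10))

# Sample and gene counts per set; the gene count must agree across sets.
sapply(multiExpr, function(s) dim(s$data))
```

The sets may differ in the number of samples, but the columns must describe the same genes in the same order.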
blocks
optional specification of blocks in which hierarchical clustering and module detection
should be performed. If given, must be a numeric vector with one entry per gene
of multiExpr giving the number of the block to which the corresponding gene belongs.
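A user-supplied blocks vector might look like the following sketch. The gene count and block sizes are hypothetical and chosen only to show the shape of the argument.

```r
# Hypothetical example: 10 genes split into two blocks of sizes 6 and 4.
nGenes <- 10
blocks <- c(rep(1, 6), rep(2, 4))

# One entry per gene is required.
stopifnot(length(blocks) == nGenes)

# Block sizes, and the indices of the genes assigned to block 2.
blockSizes <- table(blocks)
block2Genes <- which(blocks == 2)
```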
maxBlockSize
integer giving maximum block size for module detection. Ignored if blocks
above is non-NULL. Otherwise, if the number of genes in multiExpr
exceeds maxBlockSize, genes
will be pre-clustered into blocks whose size should not exceed maxBlockSize.
randomSeed
integer to be used as seed for the random number generator before the function
starts. If a current seed exists, it is saved and restored upon exit. If NULL is given, the
function will not save and restore the seed.
individualTOMInfo
Optional data for TOM matrices in individual data sets. This object is returned by
the function blockwiseIndividualTOMs. If not given, appropriate topological overlaps will be
calculated u
useIndivTOMSubset
If individualTOMInfo is given, this argument allows selecting only a subset
of the individual set networks contained in individualTOMInfo. It should be a numeric vector giving the
indices of the individual sets to be used. Note
corType
character string specifying the correlation to be used. Allowed values are (unique
abbreviations of) "pearson" and "bicor", corresponding to Pearson and biweight
midcorrelation, respectively. Missing values are handled using t
maxPOutliers
only used for corType=="bicor". Specifies the maximum percentile of data
that can be considered outliers on either
side of the median separately. For each side of the median, if
higher percentile than maxPOutliers is conside
quickCor
real number between 0 and 1 that controls the handling of missing data in the
calculation of correlations. See details.
pearsonFallback
Specifies whether the bicor calculation, if used, should revert to Pearson when
median absolute deviation (mad) is zero. Recognized values are (abbreviations of)
"none", "individual", "all". If set to "none", zero mad will re
cosineCorrelation
logical: should the cosine version of the correlation calculation be used? The
cosine calculation differs from the standard one in that it does not subtract the mean.
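The difference between the two calculations can be illustrated in base R. This is a sketch of the definitions only, on hypothetical data, and says nothing about how the package computes correlations internally; cosineSim is a helper defined here for illustration.

```r
set.seed(42)
x <- rnorm(20, mean = 3)        # hypothetical profile with a non-zero mean
y <- 0.5 * x + rnorm(20)

# Cosine similarity: like Pearson correlation, but without subtracting the mean.
cosineSim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

pearson <- cor(x, y)
cosineRaw <- cosineSim(x, y)                           # no centering
cosineCentered <- cosineSim(x - mean(x), y - mean(y))  # centering recovers Pearson
```

When the profiles have means far from zero, the uncentered value is dominated by the shared offset and can differ substantially from the Pearson correlation.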
power
soft-thresholding power for network construction.
networkType
network type. Allowed values are (unique abbreviations of) "unsigned",
"signed", "signed hybrid". See adjacency.
checkPower
logical: should a basic sanity check be performed on the supplied power? If
you would like to experiment with unusual powers, set the argument to FALSE and proceed with
caution.
TOMType
one of "none", "unsigned", "signed". If "none", adjacency
will be used for clustering. If "unsigned", the standard TOM will be used (more generally, TOM
function will receive the adjacency
TOMDenom
a character string specifying the TOM variant to be used. Recognized values are
"min" giving the standard TOM described in Zhang and Horvath (2005), and "mean" in which
the min function in the denominator is replaced by mean.
saveIndividualTOMs
logical: should individual TOMs be saved to disk for later use?
individualTOMFileNames
character string giving the file names to save individual TOMs into. The
following tags should be used to make the file names unique for each set and block: %s will be
replaced by the set number; %N will be replaced by the set
consensusQuantile
quantile at which consensus is to be defined. See details.
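As a rough illustration, the sketch below assumes the consensus is taken element-wise across sets as the given quantile of the set-specific similarities, so that a quantile of 0 corresponds to the minimum. The toy matrices and the helper consensusAt are hypothetical, and R's default quantile interpolation is an assumption of this sketch.

```r
# Three hypothetical 3 x 3 similarity matrices, one per set.
tomA <- matrix(c(1, 0.8, 0.2,  0.8, 1, 0.5,  0.2, 0.5, 1), 3, 3)
tomB <- matrix(c(1, 0.6, 0.4,  0.6, 1, 0.1,  0.4, 0.1, 1), 3, 3)
tomC <- matrix(c(1, 0.7, 0.3,  0.7, 1, 0.3,  0.3, 0.3, 1), 3, 3)

# Stack the sets along a third dimension and take a quantile per entry.
stacked <- array(c(tomA, tomB, tomC), dim = c(3, 3, 3))
consensusAt <- function(q) {
  apply(stacked, c(1, 2), quantile, probs = q, names = FALSE)
}

consMin <- consensusAt(0)     # quantile 0: element-wise minimum across sets
consMed <- consensusAt(0.5)   # quantile 0.5: element-wise median across sets
```

A quantile of 0 is the most conservative choice in this sketch: a pair of genes is only as similar as it is in the set where it is least similar.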
scaleTOMs
should set-specific TOM matrices be scaled to the same scale?
scaleQuantile
if scaleTOMs is TRUE, topological overlaps (or adjacencies if
TOMs are not computed) will be scaled such that their scaleQuantile quantiles will agree.
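One way to make a chosen quantile agree between two sets is to raise one set's similarities to a suitable power. The sketch below uses that approach purely as an assumption for illustration; the text above does not state which transformation the function applies, and the data are hypothetical.

```r
set.seed(7)
simA <- runif(2000)^2   # hypothetical similarities in (0, 1) for set A
simB <- runif(2000)^3   # set B, on a visibly different scale
scaleQuantile <- 0.95

# type = 1 returns an observed value, so a monotone transform maps it exactly.
qA <- quantile(simA, scaleQuantile, type = 1, names = FALSE)
qB <- quantile(simB, scaleQuantile, type = 1, names = FALSE)

# Choose the power p such that qB^p == qA, then rescale set B.
scalePower <- log(qA) / log(qB)
simBScaled <- simB^scalePower

qBScaled <- quantile(simBScaled, scaleQuantile, type = 1, names = FALSE)
```

Because the transformation is monotone and keeps values in (0, 1), the ranking of similarities within set B is unchanged; only the scale moves.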
sampleForScaling
if TRUE, scale quantiles will be determined from a sample of network
similarities. Note that using all data can double the memory footprint of the function and the function
may fail.
sampleForScalingFactor
determines the number of samples for scaling: the number is
1/scaleQuantile * sampleForScalingFactor. Should be set well above 1 to ensure accuracy of the
sampled quantile.
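Taking the formula exactly as written above, the default settings from the Usage section give the following sample count. This is plain arithmetic on the documented expression, not a call into the package.

```r
# Defaults from the Usage section.
scaleQuantile <- 0.95
sampleForScalingFactor <- 1000

# Number of sampled similarities, following the expression as documented.
nScalingSamples <- 1 / scaleQuantile * sampleForScalingFactor
round(nScalingSamples)  # 1053
```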
getTOMScalingSamples
logical: should samples used for TOM scaling be saved for future analysis?
This option is only available when sampleForScaling is TRUE.
saveTOMs
logical: should the consensus topological overlap matrices for each block be saved
and returned?
consensusTOMFileNames
character string giving the file names for the files containing the
consensus topological overlaps. The tag %b will be replaced by the block number. If the resulting file
names are non-unique (for example, because the user gives a file name witho
useDiskCache
should calculated network similarities in individual sets be temporarily saved
to disk? Saving to disk is somewhat slower than keeping all data in memory, but for large blocks and/or
many sets the memory footprint may be too big.
chunkSize
network similarities are saved in smaller chunks of size chunkSize.
cacheBase
character string containing the desired name for the cache files. The actual file
names will consist of cacheBase and a suffix to make the file names unique.
deepSplit
integer value between 0 and 4. Provides a simplified control over how sensitive
module detection should be to module splitting, with 0 least and 4 most sensitive. See
cutreeDynamic for more details.
detectCutHeight
dendrogram cut height for module detection. See cutreeDynamic for more details.
minModuleSize
minimum module size for module detection. See cutreeDynamic for more details.
checkMinModuleSize
logical: should sanity checks be performed on minModuleSize?
maxCoreScatter
maximum scatter of the core for a branch to be a cluster, given as the fraction
of cutHeight relative to the 5th percentile of joining heights. See
cutreeDynamic for more details.
minGap
minimum cluster gap given as the fraction of the difference between cutHeight and
the 5th percentile of joining heights. See cutreeDynamic for more details.
maxAbsCoreScatter
maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides maxCoreScatter. See cutreeDynamic for more details.
minAbsGap
minimum cluster gap given as absolute height difference. If given, overrides
minGap. See cutreeDynamic for more details.
pamStage
logical. If TRUE, the second (PAM-like) stage of module detection will be performed.
See cutreeDynamic for more details.
pamRespectsDendro
logical, only used when pamStage is TRUE. If TRUE, the PAM stage will
respect the dendrogram in the sense that an object can be PAM-assigned only to clusters that lie below it on
the branch that the object is merged into.
reassignThresholdPS
per-set p-value ratio threshold for reassigning genes between modules.
See Details.
trimmingConsensusQuantile
a number between 0 and 1 specifying the consensus quantile used for kME
calculation that determines module trimming according to the arguments below.
minCoreKME
a number between 0 and 1. If a detected module does not have at least
minCoreKMESize genes with eigengene connectivity at least minCoreKME, the module is
disbanded (its genes are unlabeled and returned to the pool of genes wa
minCoreKMESize
see minCoreKME above.
minKMEtoStay
genes whose eigengene connectivity to their module eigengene is lower than
minKMEtoStay are removed from the module.
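The trimming rule can be sketched in base R under some stated assumptions: the module eigengene is taken to be the first singular vector of the scaled expression matrix (oriented along average expression), and eigengene connectivity (kME) is the Pearson correlation of each gene with that eigengene. The data are hypothetical, and the package's own eigengene calculation (see moduleEigengenes) may differ in detail.

```r
set.seed(3)
nSamples <- 30
signal <- rnorm(nSamples)

# Hypothetical module: four genes follow a shared signal, the fifth is pure noise.
expr <- cbind(sapply(1:4, function(i) signal + rnorm(nSamples, sd = 0.3)),
              rnorm(nSamples))
colnames(expr) <- paste0("gene", 1:5)

# Assumed eigengene: first left singular vector of the scaled data (samples in rows),
# with its sign flipped if needed so it correlates positively with average expression.
scaled <- scale(expr)
eigengene <- svd(scaled)$u[, 1]
if (cor(eigengene, rowMeans(scaled)) < 0) eigengene <- -eigengene

# kME: correlation of each gene with the module eigengene.
kME <- cor(expr, eigengene)[, 1]

# Genes with kME below the threshold would be removed from the module.
minKMEtoStay <- 0.2
keep <- kME >= minKMEtoStay
```

The four signal-driven genes have kME close to 1 and stay in the module; whether the noise gene passes depends on its chance correlation with the eigengene.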
impute
logical: should imputation be used for module eigengene calculation? See
moduleEigengenes for more details.
trapErrors
logical: should errors in calculations be trapped?
mergeCutHeight
dendrogram cut height for module merging.
mergeConsensusQuantile
consensus quantile for module merging. See mergeCloseModules for details.
numericLabels
logical: should the returned modules be labeled by colors (FALSE), or by
numbers (TRUE)?
nThreads
non-negative integer specifying the number of parallel threads to be used by certain
parts of correlation calculations. This option only has an effect on systems on which a POSIX thread
library is available (which currently includes Linux and Mac OSX, b
verbose
integer level of verbosity. Zero means silent, higher values make the output
progressively more and more verbose.
indent
indentation for diagnostic messages. Zero means no indentation, each unit adds
two spaces.