Usage
blockwiseConsensusModules(
multiExpr,
# Data checking options
checkMissingData = TRUE,
# Blocking options
blocks = NULL,
maxBlockSize = 5000,
randomSeed = 12345,
# TOM precalculation arguments, if available
individualTOMInfo = NULL,
useIndivTOMSubset = NULL,
# Network construction arguments: correlation options
corType = "pearson",
maxPOutliers = 1,
quickCor = 0,
pearsonFallback = "individual",
cosineCorrelation = FALSE,
# Adjacency function options
power = 6,
networkType = "unsigned",
checkPower = TRUE,
# Topological overlap options
TOMType = "unsigned",
TOMDenom = "min",
# Save individual TOMs?
saveIndividualTOMs = TRUE,
individualTOMFileNames = "individualTOM-Set%s-Block%b.RData",
# Consensus calculation options: network calibration
networkCalibration = c("single quantile", "full quantile", "none"),
# Simple quantile calibration options
calibrationQuantile = 0.95,
sampleForCalibration = TRUE, sampleForCalibrationFactor = 1000,
getNetworkCalibrationSamples = FALSE,
# Consensus definition
consensusQuantile = 0,
useMean = FALSE,
setWeights = NULL,
# Saving the consensus TOM
saveConsensusTOMs = FALSE,
consensusTOMFileNames = "consensusTOM-block.%b.RData",
# Internal handling of TOMs
useDiskCache = TRUE, chunkSize = NULL,
cacheBase = ".blockConsModsCache",
# Alternative consensus TOM input from a previous calculation
consensusTOMInfo = NULL,
# Basic tree cut options
deepSplit = 2,
detectCutHeight = 0.995, minModuleSize = 20,
checkMinModuleSize = TRUE,
# Advanced tree cut options
maxCoreScatter = NULL, minGap = NULL,
maxAbsCoreScatter = NULL, minAbsGap = NULL,
minSplitHeight = NULL, minAbsSplitHeight = NULL,
useBranchEigennodeDissim = FALSE,
minBranchEigennodeDissim = mergeCutHeight,
stabilityLabels = NULL,
minStabilityDissim = NULL,
pamStage = TRUE, pamRespectsDendro = TRUE,
# Gene reassignment and trimming from a module, and module "significance" criteria
reassignThresholdPS = 1e-4,
trimmingConsensusQuantile = consensusQuantile,
minCoreKME = 0.5, minCoreKMESize = minModuleSize/3,
minKMEtoStay = 0.2,
# Module eigengene calculation options
impute = TRUE,
trapErrors = FALSE,
# Module merging options
equalizeQuantilesForModuleMerging = FALSE,
quantileSummaryForModuleMerging = "mean",
mergeCutHeight = 0.15,
mergeConsensusQuantile = consensusQuantile,
# Output options
numericLabels = FALSE,
# General options
nThreads = 0,
verbose = 2, indent = 0, ...)
Arguments
multiExpr
expression data in the multi-set format (see checkSets). A vector of
lists, one per set. Each set must contain a component data that contains
the expression data, with rows corresponding to
checkMissingData
logical: should data be checked for excessive numbers of missing entries in
genes and samples, and for genes with zero variance? See details.
blocks
optional specification of blocks in which hierarchical clustering and module detection
should be performed. If given, must be a numeric vector with one entry per gene
of multiExpr giving the number of the block to which the corresponding ge
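As a rough illustration of the two inputs described so far, the sketch below builds a toy multiExpr and a matching blocks vector in base R. The set names, gene names, and the helper makeSet are invented for this example, and the layout (samples in rows, genes in columns) is an assumption based on the usual WGCNA convention rather than something stated in the text above.

```r
set.seed(1)
nGenes <- 10
geneNames <- paste0("gene", seq_len(nGenes))

# Hypothetical helper: one set is a list with a component named "data".
# Assumed layout: samples in rows, genes in columns.
makeSet <- function(nSamples) {
  list(data = matrix(rnorm(nSamples * nGenes), nrow = nSamples, ncol = nGenes,
                     dimnames = list(NULL, geneNames)))
}

# Multi-set format: a list with one entry per set.
multiExpr <- list(setA = makeSet(6), setB = makeSet(8))

# One block number per gene: genes 1-5 go to block 1, genes 6-10 to block 2.
blocks <- rep(c(1, 2), each = nGenes / 2)
```

Every set carries the same genes in the same order, so a single blocks vector can describe all sets at once.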
maxBlockSize
integer giving maximum block size for module detection. Ignored if blocks
above is non-NULL. Otherwise, if the number of genes in multiExpr exceeds
maxBlockSize, genes will be pre-clustered into blocks whose size sho
randomSeed
integer to be used as seed for the random number generator before the function
starts. If a current seed exists, it is saved and restored upon exit. If NULL
is given, the function will not save and restore the seed.
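The save-and-restore behavior described for randomSeed can be sketched in base R as follows. The helper withTemporarySeed is hypothetical and is not part of the package; it only illustrates the general pattern of saving .Random.seed, setting a seed, and restoring the saved state on exit.

```r
# Hypothetical helper: evaluate `expr` under a temporary seed.
# If `seed` is NULL, the random number generator state is left untouched.
withTemporarySeed <- function(seed, expr) {
  if (!is.null(seed)) {
    hadSeed <- exists(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
    if (hadSeed) {
      # Save the current state and restore it when the function exits.
      savedSeed <- get(".Random.seed", envir = .GlobalEnv)
      on.exit(assign(".Random.seed", savedSeed, envir = .GlobalEnv), add = TRUE)
    }
    set.seed(seed)
  }
  force(expr)  # `expr` is evaluated lazily, i.e. after set.seed()
}

set.seed(42)
stateBefore <- get(".Random.seed", envir = .GlobalEnv)
draw1 <- withTemporarySeed(12345, runif(3))
draw2 <- withTemporarySeed(12345, runif(3))
stateAfter <- get(".Random.seed", envir = .GlobalEnv)
```

With this pattern the two draws are reproducible while the caller's own random number stream continues as if the function had never been called.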
individualTOMInfo
Optional data for TOM matrices in individual data sets. This object is returned by
the function blockwiseIndividualTOMs. If not given, appropriate topological overlaps will be
calculated u
useIndivTOMSubset
If individualTOMInfo is given, this argument allows selecting only a subset
of the individual set networks contained in individualTOMInfo. It should be a numeric
vector giving the indices of the individual sets to be used. Note
corType
character string specifying the correlation to be used. Allowed values are (unique
abbreviations of) "pearson" and "bicor", corresponding to Pearson and biweight
midcorrelation, respectively. Missing values are handled using t
maxPOutliers
only used for corType=="bicor". Specifies the maximum percentile of data
that can be considered outliers on either side of the median separately. For each
side of the median, if higher percentile than maxPOutliers is conside
quickCor
real number between 0 and 1 that controls the handling of missing data in the
calculation of correlations. See details.
pearsonFallback
Specifies whether the bicor calculation, if used, should revert to Pearson when
median absolute deviation (mad) is zero. Recognized values are (abbreviations of)
"none", "individual", "all". If set to "none", zero mad will re
cosineCorrelation
logical: should the cosine version of the correlation calculation be used? The
cosine calculation differs from the standard one in that it does not subtract the mean.
power
soft-thresholding power for network construction.
networkType
network type. Allowed values are (unique abbreviations of) "unsigned",
"signed", "signed hybrid". See adjacency.
checkPower
logical: should a basic sanity check be performed on the supplied power? If
you would like to experiment with unusual powers, set the argument to FALSE
and proceed with caution.
TOMType
one of "none", "unsigned", "signed". If "none", adjacency
will be used for clustering. If "unsigned", the standard TOM will be used (more generally, TOM
function will receive the adjacency
TOMDenom
a character string specifying the TOM variant to be used. Recognized values are
"min" giving the standard TOM described in Zhang and Horvath (2005), and "mean"
in which the min function in the denominator is repl
saveIndividualTOMs
logical: should individual TOMs be saved to disk for later use?
individualTOMFileNames
character string giving the file names to save individual TOMs into. The
following tags should be used to make the file names unique for each set and block: %s
will be replaced by the set number; %N will be replaced by the set
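To make the tag mechanism concrete, here is a minimal sketch of how such a template could be expanded. The helper expandTOMFileName is hypothetical and not part of the package. It handles only %s (set number) and %b (assumed to be the block number, as in the default template); the %N tag is left out because its description is cut off above.

```r
# Hypothetical helper: substitute set and block numbers into a file name template.
expandTOMFileName <- function(template, set, block) {
  name <- gsub("%s", as.character(set), template, fixed = TRUE)
  gsub("%b", as.character(block), name, fixed = TRUE)
}

template <- "individualTOM-Set%s-Block%b.RData"

# One file name per (set, block) combination for 2 sets and 3 blocks.
combos <- expand.grid(set = 1:2, block = 1:3)
fileNames <- mapply(expandTOMFileName, template, combos$set, combos$block,
                    USE.NAMES = FALSE)
```

A template that omits one of the tags would map several set and block combinations to the same file name, which is why both tags appear in the default.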
networkCalibration
network calibration method. One of "single quantile", "full quantile", "none"
(or a unique abbreviation of one of them).
calibrationQuantile
if networkCalibration is "single quantile", topological overlaps (or adjacencies if
TOMs are not computed) will be scaled such that their calibrationQuantile
quantiles will agree.
sampleForCalibration
if TRUE, calibration quantiles will be determined from a sample of network
similarities. Note that using all data can double the memory footprint of the function and the function
may fail.
sampleForCalibrationFactor
determines the number of samples for calibration: the number is
1/calibrationQuantile * sampleForCalibrationFactor. Should be set well above 1 to ensure
accuracy of the sampled quantile.
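The arithmetic behind the sample size can be worked through with the default values. The sketch below also compares a quantile estimated from a sample of that size with the quantile of the full data. The simulated similarities and the rounding with ceiling() are assumptions made for illustration only; the package may round or sample differently.

```r
calibrationQuantile <- 0.95
sampleForCalibrationFactor <- 1000

# Number of samples as given by the formula above (about 1052.6 for the defaults).
nCalibrationSamples <- 1 / calibrationQuantile * sampleForCalibrationFactor

set.seed(1)
similarities <- runif(1e5)^3  # stand-in for network similarities in one set

# Assumption: round up to a whole number of sampled entries.
sampled <- sample(similarities, size = ceiling(nCalibrationSamples))

fullQuantile <- quantile(similarities, probs = calibrationQuantile, names = FALSE)
sampledQuantile <- quantile(sampled, probs = calibrationQuantile, names = FALSE)
```

A factor close to 1 would leave only a handful of sampled values above the calibration quantile, which is why the text recommends a value well above 1.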
getNetworkCalibrationSamples
logical: should samples used for TOM calibration be saved for future analysis?
This option is only available when sampleForCalibration is TRUE.
consensusQuantile
quantile at which consensus is to be defined. See details.
useMean
logical: should the consensus be determined from a (possibly weighted) mean across the
data sets rather than a quantile?
setWeights
Optional vector (one component per input set) of weights to be used for weighted mean
consensus. Only used when useMean above is TRUE.
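The difference between a quantile consensus and a (weighted) mean consensus can be sketched on toy data. The code below assumes the consensus is taken element-wise across the set similarity matrices; the helpers makeSimilarity, consensusByQuantile, and consensusByMean are hypothetical and only illustrate the idea, not the package's internal code.

```r
set.seed(1)
nGenes <- 4

# Hypothetical helper: a random symmetric similarity matrix with unit diagonal.
makeSimilarity <- function() {
  m <- matrix(runif(nGenes^2), nGenes, nGenes)
  m <- (m + t(m)) / 2
  diag(m) <- 1
  m
}

setSimilarities <- list(makeSimilarity(), makeSimilarity(), makeSimilarity())
nSets <- length(setSimilarities)
stacked <- array(unlist(setSimilarities), dim = c(nGenes, nGenes, nSets))

# Element-wise quantile across sets; q = 0 gives the minimum.
consensusByQuantile <- function(stack, q) {
  apply(stack, c(1, 2), quantile, probs = q, names = FALSE)
}

# Element-wise weighted mean across sets (equal weights by default).
consensusByMean <- function(stack, weights = rep(1, dim(stack)[3])) {
  apply(stack, c(1, 2), weighted.mean, w = weights)
}

consensusMin <- consensusByQuantile(stacked, 0)
consensusMean <- consensusByMean(stacked, weights = c(2, 1, 1))
```

With consensusQuantile = 0 a gene pair is considered similar only if it is similar in every set, whereas the mean lets a strong similarity in one set compensate for a weak one in another.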
saveConsensusTOMs
logical: should the consensus topological overlap matrices for each block be saved
and returned?
consensusTOMFileNames
character string giving the names of the files containing the
consensus topological overlaps. The tag %b will be replaced by the block number. If the
resulting file names are non-unique (for example, because the user gives a file name witho
useDiskCache
should calculated network similarities in individual sets be temporarily saved
to disk? Saving to disk is somewhat slower than keeping all data in memory, but for large blocks and/or
many sets the memory footprint may be too big.
chunkSize
network similarities are saved in smaller chunks of size chunkSize.
cacheBase
character string containing the desired name for the cache files. The actual file
names will consist of cacheBase and a suffix to make the file names unique.
consensusTOMInfo
optional list summarizing consensus TOM, output of consensusTOM. It
contains information about pre-calculated consensus TOM. Supplying this argument replaces TOM calculation,
so none of the individua
deepSplit
integer value between 0 and 4. Provides a simplified control over how sensitive
module detection should be to module splitting, with 0 least and 4 most sensitive. See
cutreeDynamic for
detectCutHeight
dendrogram cut height for module detection. See cutreeDynamic for more details.
minModuleSize
minimum module size for module detection. See cutreeDynamic for more details.
checkMinModuleSize
logical: should sanity checks be performed on minModuleSize?
maxCoreScatter
maximum scatter of the core for a branch to be a cluster, given as the fraction
of cutHeight relative to the 5th percentile of joining heights. See
cutreeDynamic for more
minGap
minimum cluster gap given as the fraction of the difference between cutHeight and
the 5th percentile of joining heights. See cutreeDynamic for more details.
maxAbsCoreScatter
maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides maxCoreScatter. See cutreeDynamic for more details.
minAbsGap
minimum cluster gap given as absolute height difference. If given, overrides
minGap. See cutreeDynamic for more details.
minSplitHeight
Minimum split height given as the fraction of the difference between
cutHeight and the 5th percentile of joining heights. Branches merging below this height will
automatically be merged. Defaults to zero but is used only if minAbsSpli
minAbsSplitHeight
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from minSplitHeight above.
useBranchEigennodeDissim
Logical: should branch eigennode (eigengene) dissimilarity be considered
when merging branches in Dynamic Tree Cut?
minBranchEigennodeDissim
Minimum consensus branch eigennode (eigengene) dissimilarity for
branches to be considered separate. The branch eigennode dissimilarity in individual sets
is simply 1-correlation of the
eigennodes; the consensus is defined as quantile with probability
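For a single set, the dissimilarity just described is easy to compute directly. The sketch below uses two simulated eigennode vectors and compares the result with the default threshold from the Usage section (minBranchEigennodeDissim = mergeCutHeight = 0.15). The simulated data and the use of >= in the comparison are assumptions, and the consensus step across sets is not shown.

```r
set.seed(1)
nSamples <- 20

# Simulated eigennodes of two branches in one data set.
eigennodeA <- rnorm(nSamples)
eigennodeB <- eigennodeA + rnorm(nSamples, sd = 0.5)

# Branch eigennode dissimilarity in an individual set: 1 - correlation.
branchDissim <- 1 - cor(eigennodeA, eigennodeB)

mergeCutHeight <- 0.15
minBranchEigennodeDissim <- mergeCutHeight  # default taken from the Usage section

# Assumed rule: branches stay separate if they are dissimilar enough.
keepSeparate <- branchDissim >= minBranchEigennodeDissim
```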
stabilityLabels
Optional matrix of cluster labels that are to be used for calculating branch
dissimilarity based on split stability. The number of rows must equal the number of genes in
multiExpr; the number of columns (clusterings) is arbitrary. See
minStabilityDissim
Minimum stability dissimilarity criterion for two branches to be considered
separate. Should be a number between 0 (essentially no dissimilarity required) and 1 (perfect dissimilarity
or distinguishability based on stabilityLabels). See
pamStage
logical. If TRUE, the second (PAM-like) stage of module detection will be performed.
See cutreeDynamic for more details.
pamRespectsDendro
Logical, only used when pamStage is TRUE. If TRUE, the PAM stage will
respect the dendrogram in the sense that an object can be PAM-assigned only to clusters that lie below it on
the branch that the object is merged in
reassignThresholdPS
per-set p-value ratio threshold for reassigning genes between modules.
See Details.
trimmingConsensusQuantile
a number between 0 and 1 specifying the consensus quantile used for kME
calculation that determines module trimming according to the arguments below.
minCoreKME
a number between 0 and 1. If a detected module does not have at least
minCoreKMESize genes with eigengene connectivity at least minCoreKME, the module is
disbanded (its genes are unlabeled and returned to the pool of genes wa
minCoreKMESize
see minCoreKME above.
minKMEtoStay
genes whose eigengene connectivity to their module eigengene is lower than
minKMEtoStay are removed from the module.
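The trimming rule can be illustrated for a single data set. The sketch below assumes that the module eigengene is the first principal component of the scaled module expression and that eigengene connectivity (kME) is the correlation of each gene with that eigengene; both are the usual WGCNA definitions but are not stated in the text above. The simulated data and gene names are invented for the example.

```r
set.seed(1)
nSamples <- 30
driver <- rnorm(nSamples)

# Five genes that follow a common driver, plus one unrelated gene.
coreGenes <- sapply(1:5, function(i) driver + rnorm(nSamples, sd = 0.3))
moduleExpr <- cbind(coreGenes, rnorm(nSamples))
colnames(moduleExpr) <- c(paste0("core", 1:5), "loose")

# Assumed eigengene: first left singular vector of the scaled expression,
# with its sign aligned to the average expression of the module.
scaled <- scale(moduleExpr)
eigengene <- svd(scaled)$u[, 1]
if (cor(eigengene, rowMeans(scaled)) < 0) eigengene <- -eigengene

# Assumed kME: correlation of each gene with the module eigengene.
kME <- cor(moduleExpr, eigengene)[, 1]

minKMEtoStay <- 0.2
staysInModule <- kME >= minKMEtoStay
```

In the consensus setting the kME values from the individual sets would first be combined using trimmingConsensusQuantile; that step is omitted here.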
impute
logical: should imputation be used for module eigengene calculation? See
moduleEigengenes for more details.
trapErrors
logical: should errors in calculations be trapped?
equalizeQuantilesForModuleMerging
Logical: equalize quantiles of the module eigengene networks
before module merging? If TRUE, the quantiles of the eigengene correlation matrices (interpreted as
single vectors of non-redundant components) will be equalized across the inpu
quantileSummaryForModuleMerging
One of "mean" or "median".
If quantile equalization of the module eigengene networks is
performed, the resulting "normal" quantiles will be given by this function of the corresponding quantiles
across the input data sets.
mergeCutHeight
dendrogram cut height for module merging.
mergeConsensusQuantile
consensus quantile for module merging. See mergeCloseModules for
details.
numericLabels
logical: should the returned modules be labeled by colors (FALSE), or by
numbers (TRUE)?
nThreads
non-negative integer specifying the number of parallel threads to be used by certain
parts of correlation calculations. This option only has an effect on systems on which a POSIX thread
library is available (which currently includes Linux and Mac OSX, b
verbose
integer level of verbosity. Zero means silent, higher values make the output
progressively more and more verbose.
indent
indentation for diagnostic messages. Zero means no indentation, each unit adds
two spaces.
...
Other arguments. At present these can include reproduceBranchEigennodeQuantileError
that instructs the function to reproduce a bug in branch eigennode dissimilarity
calculations for the purpose of reproducing old results.