Learn R Programming

WGCNA (version 1.25-1)

recutBlockwiseTrees: Repeat blockwise module detection from pre-calculated data

Description

Given consensus networks constructed for example using blockwiseModules, this function (re-)detects modules in them by branch cutting of the corresponding dendrograms. If repeated branch cuts of the same gene network dendrograms are desired, this function can save substantial time by re-using already calculated networks and dendrograms.

Usage

recutBlockwiseTrees(
  datExpr,
  goodSamples, goodGenes,
  blocks,
  TOMFiles,
  dendrograms,
  corType = "pearson",
  networkType = "unsigned",
  deepSplit = 2,
  detectCutHeight = 0.995, minModuleSize = min(20, ncol(datExpr)/2 ),
  maxCoreScatter = NULL, minGap = NULL,
  maxAbsCoreScatter = NULL, minAbsGap = NULL,
  pamStage = TRUE, pamRespectsDendro = TRUE,
  minCoreKME = 0.5, minCoreKMESize = minModuleSize/3,
  minKMEtoStay = 0.3,
  reassignThreshold = 1e-6,
  mergeCutHeight = 0.15, impute = TRUE,
  trapErrors = FALSE, numericLabels = FALSE,
  verbose = 0, indent = 0)

Arguments

datExpr
expression data. A data frame in which columns are genes and rows ar samples. NAs are allowed, but not too many.
goodSamples
a logical vector specifying which samples are considered "good" for the analysis. See goodSamplesGenes.
goodGenes
a logical vector with length equal number of genes in multiExpr that specifies which genes are considered "good" for the analysis. See goodSamplesGenes.
blocks
specification of blocks in which hierarchical clustering and module detection should be performed. A numeric vector with one entry per gene of multiExpr giving the number of the block to which the corresponding gene belongs.
TOMFiles
a vector of character strings specifying file names in which the block-wise topological overlaps are saved.
dendrograms
a list of length equal the number of blocks, in which each component is a hierarchical clustering dendrograms of the genes that belong to the block.
corType
character string specifying the correlation to be used. Allowed values are (unique abbreviations of) "pearson" and "bicor", corresponding to Pearson and bidweight midcorrelation, respectively. Missing values are handled using t
networkType
network type. Allowed values are (unique abbreviations of) "unsigned", "signed", "signed hybrid". See adjacency.
deepSplit
integer value between 0 and 4. Provides a simplified control over how sensitive module detection should be to module splitting, with 0 least and 4 most sensitive. See cutreeDynamic for
detectCutHeight
dendrogram cut height for module detection. See cutreeDynamic for more details.
minModuleSize
minimum module size for module detection. See cutreeDynamic for more details.
maxCoreScatter
maximum scatter of the core for a branch to be a cluster, given as the fraction of cutHeight relative to the 5th percentile of joining heights. See cutreeDynamic for more
minGap
minimum cluster gap given as the fraction of the difference between cutHeight and the 5th percentile of joining heights. See cutreeDynamic for more details.
maxAbsCoreScatter
maximum scatter of the core for a branch to be a cluster given as absolute heights. If given, overrides maxCoreScatter. See cutreeDynamic for more details.
minAbsGap
minimum cluster gap given as absolute height difference. If given, overrides minGap. See cutreeDynamic for more details.
pamStage
logical. If TRUE, the second (PAM-like) stage of module detection will be performed. See cutreeDynamic for more details.
pamRespectsDendro
Logical, only used when pamStage is TRUE. If TRUE, the PAM stage will respect the dendrogram in the sense an object can be PAM-assigned only to clusters that lie below it on the branch that the object is merged in
minCoreKME
a number between 0 and 1. If a detected module does not have at least minModuleKMESize genes with eigengene connectivity at least minCoreKME, the module is disbanded (its genes are unlabeled and returned to the pool of genes wa
minCoreKMESize
see minCoreKME above.
minKMEtoStay
genes whose eigengene connectivity to their module eigengene is lower than minKMEtoStay are removed from the module.
reassignThreshold
p-value ratio threshold for reassigning genes between modules. See Details.
mergeCutHeight
dendrogram cut height for module merging.
impute
logical: should imputation be used for module eigengene calculation? See moduleEigengenes for more details.
trapErrors
logical: should errors in calculations be trapped?
numericLabels
logical: should the returned modules be labeled by colors (FALSE), or by numbers (TRUE)?
verbose
integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose.
indent
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces.

Value

  • A list with the following components:
  • colorsa vector of color or numeric module labels for all genes.
  • unmergedColorsa vector of color or numeric module labels for all genes before module merging.
  • MEsa data frame containing module eigengenes of the found modules (given by colors).
  • MEsOKlogical indicating whether the module eigengenes were calculated without errors.

Details

For details on blockwise module detection, see blockwiseModules. This function implements the module detection subset of the functionality of blockwiseModules; network construction and clustering must be performed in advance. The primary use of this function is to experiment with module detection settings without having to re-execute long network and clustering calculations whose results are not affected by the cutting parameters. This function takes as input the networks and dendrograms that are produced by blockwiseModules. Working block by block, modules are identified in the dendrogram by the Dynamic Hybrid Tree Cut algorithm. Found modules are trimmed of genes whose correlation with module eigengene (KME) is less than minKMEtoStay. Modules in which fewer than minCoreKMESize genes have KME higher than minCoreKME are disbanded, i.e., their constituent genes are pronounced unassigned. After all blocks have been processed, the function checks whether there are genes whose KME in the module they assigned is lower than KME to another module. If p-values of the higher correlations are smaller than those of the native module by the factor reassignThresholdPS, the gene is re-assigned to the closer module. In the last step, modules whose eigengenes are highly correlated are merged. This is achieved by clustering module eigengenes using the dissimilarity given by one minus their correlation, cutting the dendrogram at the height mergeCutHeight and merging all modules on each branch. The process is iterated until no modules are merged. See mergeCloseModules for more details on module merging.

References

Bin Zhang and Steve Horvath (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology: Vol. 4: No. 1, Article 17

See Also

blockwiseModules for full module calculation; cutreeDynamic for adaptive branch cutting in hierarchical clustering dendrograms; mergeCloseModules for merging of close modules.