blockwiseIndividualTOMs(
   multiExpr,

   # Data checking options
   checkMissingData = TRUE,

   # Blocking options
   blocks = NULL,
   maxBlockSize = 5000,
   randomSeed = 12345,

   # Network construction arguments: correlation options
   corType = "pearson",
   maxPOutliers = 1,
   quickCor = 0,
   pearsonFallback = "individual",
   cosineCorrelation = FALSE,

   # Adjacency function options
   power = 6,
   networkType = "unsigned",
   checkPower = TRUE,

   # Topological overlap options
   TOMType = "unsigned",
   TOMDenom = "min",

   # Save individual TOMs? If not, they will be returned in the session.
   saveTOMs = TRUE,
   individualTOMFileNames = "individualTOM-Set%s-Block%b.RData",

   # General options
   nThreads = 0,
   verbose = 2, indent = 0)

multiExpr: [...] checkSets). A vector of lists, one per set. Each set must contain a component data that contains the expression data, with rows corresponding to s[...]

blocks: [...] multiExpr giving the number of the block to which the corresponding gene [...]

maxBlockSize: [...] blocks above is non-NULL. Otherwise, if the number of genes in multiExpr exceeds maxBlockSize, genes will be pre-clustered into blocks whose size shoul[...]

randomSeed: [...] NULL is given, the function will not save and restore the seed.

corType: [...] "pearson"
and "bicor", corresponding to Pearson and biweight midcorrelation, respectively. Missing values are handled using the [...]

maxPOutliers: [...] corType=="bicor". Specifies the maximum percentile of data that can be considered outliers on either side of the median separately. For each side of the median, if higher percentile than maxPOutliers is considered a[...]

pearsonFallback: [...] "none", "individual", "all". If set to "none", zero mad will resul[...]

networkType: [...] "unsigned", "signed", "signed hybrid". See adjacency.

checkPower: [...] power? If you would like to experiment with unusual powers, set the argument to FALSE and proceed with caution.

TOMType: [...] "none", "unsigned", "signed". If "none", adjacency will be used for clustering. If "unsigned", the standard TOM will be used (more generally, TOM function will receive the adjacency a[...]

TOMDenom: [...] "min"
giving the standard TOM described in Zhang and Horvath (2005), and "mean" in which the min function in the denominator is replaced [...]

saveTOMs: [...] TRUE) or returned in the return value (FALSE)? Returning calculated TOMs via the return value may be more convenient but not always feasible if the matrices are too big to fit all i[...]

individualTOMFileNames: [...] %s will be replaced by the set number; %N will be replaced by the set na[...]

[...] saveTOMs
is TRUE. A matrix of character strings giving the file names in which each block TOM is saved. Rows correspond to data sets and columns to blocks.

[...] saveTOMs is FALSE. A list in which each component corresponds to one block. Each component is a matrix of dimensions N times (number of sets), where N is the length of a distance structure corresponding to the block. That is, if the block contains n genes, N = n*(n-1)/2. Each column of the matrix contains the topological overlap of variables in the corresponding set (and the corresponding block), arranged as a distance structure. Note, however, that the topological overlap is a similarity, not a distance.

[...] blocks
was given, its copy; otherwise a vector of length equal to the number of genes giving the block label for each gene. Note that block labels are not necessarily sorted in the order in which the blocks were processed (since we do not require this for the input blocks). See blockOrder below.

[...] multiExpr) of genes in the corresponding block.

[...] checkMissingData
is TRUE, the output of the function goodSamplesGenesMS. A list with components goodGenes (logical vector indicating which genes passed the missing data filters), goodSamples (a list of logical vectors indicating which samples passed the missing data filters in each set), and allOK (a logical indicating whether all genes and all samples passed the filters). See goodSamplesGenesMS for more details. If checkMissingData is FALSE, goodSamplesAndGenes contains a list of the same type but indicating that all genes and all samples passed the missing data filters.

[...] blockwiseConsensusModules.

[...] checkMissingData
is TRUE), or the number of all genes (if checkMissingData is FALSE).

[...] blocks (above), restricted to good genes only.

[...] TRUE) or returned in the return value (FALSE)?

[...] names attribute of input multiExpr.

If blocks is not given and the number of genes exceeds maxBlockSize, genes are pre-clustered into blocks using the function consensusProjectiveKMeans; otherwise all genes are treated in a single block.
For each block of genes, the network is constructed and (if requested) topological overlap is calculated in each set. The topological overlaps can be saved to disk as RData files, or returned directly within the return value (see below). Note that the matrices can be big and returning them within the return value can quickly exhaust the system's memory. In particular, if the block-wise calculation is necessary, it is nearly certain that returning all matrices via the return value will be impossible.
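To get a feel for the sizes involved, the following base R sketch estimates the memory needed to hold the topological overlaps of one block when they are returned in the return value. It assumes the layout described for saveTOMs = FALSE (one double-precision matrix with n*(n-1)/2 rows and one column per set). The helper name tomReturnBytes and the example sizes are illustrative only and are not part of WGCNA.

```r
# Hypothetical helper (not part of WGCNA): approximate number of bytes needed
# to hold the returned TOM of one block, assuming 8-byte doubles.
tomReturnBytes <- function(nGenes, nSets) {
  N <- nGenes * (nGenes - 1) / 2   # length of the distance structure
  N * nSets * 8
}

# One block at the default maxBlockSize = 5000, with 2 sets
bytesPerBlock <- tomReturnBytes(5000, 2)
bytesPerBlock / 1e9        # roughly 0.2 GB

# Illustrative case: 20000 genes split into 4 blocks of 5000 genes, 2 sets
4 * bytesPerBlock / 1e9    # roughly 0.8 GB
```

The estimate covers only the returned matrices; working memory used during the calculation comes on top of it.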
Bin Zhang and Steve Horvath (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology: Vol. 4: No. 1, Article 17
The blockwise approach is briefly described in the article describing this package,
Langfelder P, Horvath S (2008) "WGCNA: an R package for weighted correlation network analysis". BMC Bioinformatics 2008, 9:559
blockwiseConsensusModules
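
As a small illustration of the distance-structure layout used for topological overlaps returned in the return value, the base R sketch below stores the lower triangle of a toy similarity matrix as a vector of length n*(n-1)/2 and rebuilds the full matrix from it. The toy matrix and all variable names are made up for illustration. The sketch assumes the ordering used by base R's as.dist and a self-similarity of 1 on the diagonal.

```r
n <- 4
set.seed(1)

# Toy symmetric similarity matrix standing in for the TOM of one block
sim <- matrix(runif(n * n), n, n)
sim <- (sim + t(sim)) / 2
diag(sim) <- 1

# Store as a distance structure: lower triangle, column by column
tomVec <- as.vector(as.dist(sim))
length(tomVec) == n * (n - 1) / 2

# Rebuild the full symmetric matrix from the vector
rebuilt <- matrix(0, n, n)
rebuilt[lower.tri(rebuilt)] <- tomVec
rebuilt <- rebuilt + t(rebuilt)
diag(rebuilt) <- 1   # assumed: self-similarity of 1

isTRUE(all.equal(rebuilt, sim))
```

With several sets, the same reconstruction would be applied to one column of a block's matrix at a time.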