This function calculates several measures of fuzzy module membership in hierarchical consensus modules.
hierarchicalConsensusKME(
multiExpr,
moduleLabels,
multiEigengenes = NULL,
consensusTree,
signed = TRUE,
useModules = NULL,
metaAnalysisWeights = NULL,
corAndPvalueFnc = corAndPvalue, corOptions = list(),
corComponent = "cor", getFDR = FALSE,
useRankPvalue = TRUE,
rankPvalueOptions = list(calculateQvalue = getFDR, pValueMethod = "scale"),
setNames = NULL, excludeGrey = TRUE,
greyLabel = if (is.numeric(moduleLabels)) 0 else "grey",
reportWeightType = NULL,
getOwnModuleZ = TRUE,
getBestModuleZ = TRUE,
getOwnConsensusKME = TRUE,
getBestConsensusKME = TRUE,
getAverageKME = FALSE,
getConsensusKME = TRUE, getMetaP = FALSE,
getMetaFDR = getMetaP && getFDR,
getSetKME = TRUE,
getSetZ = FALSE,
getSetP = FALSE,
getSetFDR = getSetP && getFDR,
includeID = TRUE,
additionalGeneInfo = NULL,
includeWeightTypeInColnames = TRUE)
Expression data in the multi-set format (see checkSets). A vector of lists, one per set. Each set must contain a component data that contains the expression data, with rows corresponding to samples and columns to genes or probes.
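As an illustration of the multi-set format described above, the following minimal sketch builds a two-set input from simulated values using base R only. The helper makeSet, the gene and sample names, and the set sizes are hypothetical and chosen purely for illustration.

```r
# Minimal sketch of the multi-set format: a list with one element per set,
# each element being a list with a component named "data"
# (samples in rows, genes or probes in columns).
set.seed(1)
nGenes <- 10
geneNames <- paste0("Gene_", seq_len(nGenes))

# Hypothetical helper that simulates one set with the given number of samples.
makeSet <- function(nSamples) {
  expr <- matrix(rnorm(nSamples * nGenes), nrow = nSamples, ncol = nGenes,
                 dimnames = list(paste0("Sample_", seq_len(nSamples)), geneNames))
  list(data = expr)
}

# Sets may have different numbers of samples but share the same genes (columns).
multiExpr <- list(Set_1 = makeSet(30), Set_2 = makeSet(45))

# Dimensions (samples, genes) of each set.
setDims <- sapply(multiExpr, function(set) dim(set$data))
```

Real data would normally be validated with checkSets before being passed on.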
A vector with one entry per column (gene or probe) in multiExpr, giving the module labels.
Optional specification of module eigengenes of the modules (moduleLabels) in data sets within multiExpr. If not given, will be calculated.
A list specifying the consensus calculation. See details.
Logical: should module membership be considered signed? Signed membership should be used for signed (including signed hybrid) networks and means that negative module membership means the gene is not a member of the module. In other words, in signed networks negative kME values are not considered significant and the corresponding p-values will be one-sided. In unsigned networks, negative kME values are considered significant and the corresponding p-values will be two-sided.
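The difference between one-sided and two-sided p-values can be seen in a small base R sketch. It uses cor.test from the stats package rather than this function's own machinery, and the simulated gene and eigengene vectors are assumptions made only for illustration.

```r
# Sketch: a gene that is negatively correlated with a module eigengene.
set.seed(42)
eigengene <- rnorm(40)
gene <- -0.6 * eigengene + rnorm(40, sd = 0.8)

# Signed view: only positive correlation counts as membership (one-sided test).
pSigned <- cor.test(gene, eigengene, alternative = "greater")$p.value

# Unsigned view: strong correlation of either sign counts (two-sided test).
pUnsigned <- cor.test(gene, eigengene, alternative = "two.sided")$p.value
```

In this sketch the one-sided p-value is large because the correlation is negative, while the two-sided p-value is small.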
Optional vector specifying which modules should be used. Defaults to all modules except the unassigned module.
Optional specification of meta-analysis weights for each input set. If given, must be a numeric vector of length equal to the number of input data sets (i.e., length(multiExpr)). These weights will be used in addition to constant weights and weights proportional to the number of samples (observations) in each set.
Function that calculates associations between expression profiles and eigengenes. See details.
List giving additional arguments to function corAndPvalueFnc. See details.
Name of the component of output of corAndPvalueFnc that contains the actual correlation.
Logical: should FDR be calculated?
Logical: should the rankPvalue function be used to obtain alternative meta-analysis statistics?
Additional options for function rankPvalue. These include na.last (default "keep"), ties.method (default "average"), calculateQvalue (default copied from input getFDR), and pValueMethod (default "scale"). See the help file for rankPvalue for full details.
Names for the input sets. If not given, will be taken from names(multiExpr). If those are NULL as well, the names will be "Set_1", "Set_2", ....
Logical: should the grey module be excluded from the kME tables? Since the grey module is typically not a real module, it makes little sense to report kME values for it.
Label of the grey (unassigned) module.
One of "equal", "rootDoF", "DoF", "user". Indicates which of the weights should be reported in the output. If not given, all available weight types will be reported; this always includes "equal", "rootDoF", "DoF", while "user" weights are reported if metaAnalysisWeights above is given.
Logical: should meta-analysis Z statistic in own module be returned as a column of the output?
Logical: should highest meta-analysis Z statistic across all modules and the corresponding module be returned as columns of the output?
Logical: should consensus KME (eigengene-based connectivity) statistic in own module be returned as a column of the output?
Logical: should highest consensus KME across all modules and the corresponding module be returned as columns of the output?
Logical: should average KME be calculated?
Logical: should consensus KME be calculated?
Logical: should meta-analysis p-values corresponding to the KME meta-analysis Z statistics be calculated?
Logical: should FDR estimates for the meta-analysis p-values corresponding to the KME meta-analysis Z statistics be calculated?
Logical: should KME values for individual sets be returned?
Logical: should Z statistics corresponding to KME for individual sets be returned?
Logical: should p values corresponding to KME for individual sets be returned?
Logical: should FDR estimates corresponding to KME for individual sets be returned?
Logical: should gene ID (taken from column names of multiExpr) be included as the first column in the output?
Optional data frame with rows corresponding to genes in multiExpr that should be included as part of the output.
Logical: should weight type ("equal", "rootDoF", "DoF", "user") be included in appropriate meta-analysis column names?
Data frame with the following components, some of which may be missing depending on input options (for easier readability the order here is not the same as in the actual output):
Gene ID, taken from the column names of the first input data set
If given, a copy of additionalGeneInfo.
Meta-analysis Z statistic for membership in assigned module.
Maximum meta-analysis Z statistic for membership across all modules.
Module in which the maximum meta-analysis Z statistic is attained.
Consensus KME in assigned module.
Maximum consensus KME across all modules.
Module in which the maximum consensus KME is attained.
Consensus kME (that is, the requested quantile of the kMEs in the individual data sets) in each module for each gene across the input data sets. The module labels (here 1, 2, etc.) correspond to those in moduleLabels.
Average kME in each module for each gene across the input data sets.
Weighted average kME in each module for each gene across the input data sets. The weight of each data set is proportional to the square root of the number of samples in the set.
Weighted average kME in each module for each gene across the input data sets. The weight of each data set is proportional to number of samples in the set.
(Only present if input metaAnalysisWeights is non-NULL.) Weighted average kME in each module for each gene across the input data sets. The weight of each data set is given in metaAnalysisWeights.
Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set equally. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by the square root of the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by metaAnalysisWeights. Only returned if metaAnalysisWeights is non-NULL and the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
p-values obtained from the equal-weight meta-analysis Z statistics. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
p-values obtained from the meta-analysis Z statistics with weights proportional to the square root of the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
p-values obtained from the degree-of-freedom weight meta-analysis Z statistics. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
p-values obtained from the user-supplied weight meta-analysis Z statistics. Only returned if metaAnalysisWeights is non-NULL and the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.
q-values obtained from the equal-weight meta-analysis p-values. Only present if getMetaFDR is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.
q-values obtained from the meta-analysis p-values with weights proportional to the square root of the number of samples. Only present if getMetaFDR is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.
q-values obtained from the degree-of-freedom weight meta-analysis p-values. Only present if getMetaFDR is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.
q-values obtained from the user-specified weight meta-analysis p-values. Only present if metaAnalysisWeights is non-NULL, getMetaFDR is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.
The next set of columns contain the results of function rankPvalue and are only present if input useRankPvalue is TRUE. Some columns may be missing depending on the options specified in rankPvalueOptions. We explicitly list columns that are based on weighing each set equally; names of these columns carry the suffix .equalWeights
This is the minimum between pValueLowRank and pValueHighRank, i.e. min(pValueLow, pValueHigh)
Asymptotic p-value for observing a consistently low value based on the rank method.
Asymptotic p-value for observing a consistently high value across the columns of datS based on the rank method.
This is the minimum between pValueLowScale and pValueHighScale, i.e. min(pValueLow, pValueHigh)
Asymptotic p-value for observing a consistently low value across the columns of datS based on the Scale method.
Asymptotic p-value for observing a consistently high value across the columns of datS based on the Scale method.
local false discovery rate (q-value) corresponding to the p-value pValueExtremeRank
local false discovery rate (q-value) corresponding to the p-value pValueLowRank
local false discovery rate (q-value) corresponding to the p-value pValueHighRank
local false discovery rate (q-value) corresponding to the p-value pValueExtremeScale
local false discovery rate (q-value) corresponding to the p-value pValueLowScale
local false discovery rate (q-value) corresponding to the p-value pValueHighScale
Analogous columns corresponding to weighing individual sets by the square root of the number of samples, by number of samples, and by user weights (if given). The corresponding column name suffixes are .RootDoFWeights, .DoFWeights, and .userWeights.
The following set of columns summarize kME in individual input data sets.
kME values for each gene in each module in each given data set.
p-values corresponding to kME values for each gene in each module in each given data set.
q-values corresponding to kME values for each gene in each module in each given data set. Only returned if getSetFDR is TRUE.
Z statistics corresponding to kME values for each gene in each module in each given data set. Only present if the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.
This function calculates several measures of (hierarchical) consensus KME (eigengene-based intramodular connectivity or fuzzy module membership) for all genes in all modules.
First, it calculates the meta-analysis Z statistics for correlations between genes and module eigengenes; this is known as the consensus module membership Z statistic. The meta-analysis weights can be specified by the user either explicitly or implicitly ("equal", "rootDoF" or "DoF").
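The weighting schemes can be illustrated with a short base R sketch. It assumes the standard weighted Stouffer combination, in which the meta-analysis Z is the weighted sum of per-set Z statistics divided by the square root of the sum of squared weights; whether this function uses exactly that formula (and how it handles missing values) is not stated here. The helper metaZ and the toy numbers are hypothetical.

```r
# Hypothetical helper: weighted Stouffer combination of per-set Z statistics.
# Z is a genes x sets matrix; weights has one entry per set.
metaZ <- function(Z, weights) {
  as.vector(Z %*% weights) / sqrt(sum(weights^2))
}

# Toy input: two genes, two sets with 16 and 64 samples.
nSamples <- c(16, 64)
Z <- rbind(geneA = c(2, 3), geneB = c(-1, 4))

# The three implicit weight types named in the text.
weightTypes <- list(
  equal   = rep(1, length(nSamples)),  # every set counts the same
  rootDoF = sqrt(nSamples),            # square root of the number of samples
  DoF     = nSamples                   # number of samples
)

# One column of meta-analysis Z statistics per weight type.
metaZByType <- sapply(weightTypes, function(w) metaZ(Z, w))
rownames(metaZByType) <- rownames(Z)
```

User-supplied weights (metaAnalysisWeights) would enter the same way, as one more weight vector.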
Second, it can calculate the consensus KME, i.e., the hierarchical consensus of the KMEs (correlations with eigengenes) across the individual sets. The consensus calculation is specified in the argument consensusTree; typically, the consensusTree used here will be the same as the one used for the actual consensus network construction and module identification. See newConsensusTree for details on how to specify consensus trees.
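For a single-level (non-nested) consensus, the idea of taking a quantile across sets can be sketched in base R as follows. This is a simplified illustration only: it ignores any calibration and nesting that a real consensus tree may specify, and the variable consensusQuantile and the toy kME values are assumptions.

```r
# Toy kME values of two genes for one module in three data sets.
kME <- rbind(geneA = c(0.9, 0.7, 0.8),
             geneB = c(0.6, -0.1, 0.5))
colnames(kME) <- paste0("Set_", 1:3)

# Assumed consensus quantile; 0 corresponds to the minimum across sets.
consensusQuantile <- 0

# Consensus kME per gene: the chosen quantile of its kME values across sets.
consKME <- apply(kME, 1, quantile, probs = consensusQuantile, names = FALSE)
```

With a quantile of 0, a gene has high consensus kME only if its kME is high in every set.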
Third, the function can also calculate the (weighted) average KME using the meta-analysis weights; the average KME can be interpreted as the meta-analysis of the KMEs in the individual sets. This is related to but somewhat distinct from the meta-analysis Z statistics.
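A weighted average of per-set kME values can be sketched with base R's weighted.mean. The toy values are assumptions, and the sketch does not reproduce how this function treats missing values.

```r
# Toy kME values of two genes for one module in two data sets.
kME <- rbind(geneA = c(0.8, 0.6),
             geneB = c(0.2, 0.5))
nSamples <- c(16, 64)

# Average kME under the three implicit weight types.
avgKME <- cbind(
  equal   = rowMeans(kME),
  rootDoF = apply(kME, 1, weighted.mean, w = sqrt(nSamples)),
  DoF     = apply(kME, 1, weighted.mean, w = nSamples)
)
```

Unlike a meta-analysis Z statistic, the averages stay on the correlation scale, so they do not grow with the number of sets or samples.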
In addition to these, optional output also includes, for each gene, KME values in the module to which the gene is assigned as well as the maximum KME values and modules for which the maxima are attained. For most genes, the assigned module will be the one with highest KME values, but for some genes the assigned module and module of maximum KME may be different.
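The distinction between the assigned ("own") module and the module of maximum KME can be sketched in base R. The toy matrix and labels are hypothetical, and the column names produced by the actual function are not reproduced here.

```r
# Toy consensus kME of two genes in three modules labeled 1, 2, 3.
kME <- rbind(geneA = c(0.90, 0.20, 0.10),
             geneB = c(0.55, 0.60, 0.00))
colnames(kME) <- c("1", "2", "3")

# Both genes are assigned to module 1.
moduleLabels <- c(geneA = 1, geneB = 1)

# kME in the assigned module: pick one (row, column) entry per gene.
ownColumn <- match(as.character(moduleLabels), colnames(kME))
ownKME <- kME[cbind(seq_len(nrow(kME)), ownColumn)]

# Highest kME across modules and the module where it is attained.
bestKME <- apply(kME, 1, max)
bestModule <- colnames(kME)[max.col(kME, ties.method = "first")]
```

Here geneB is assigned to module 1 but has a slightly higher kME in module 2, which is the situation the paragraph above describes.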
The function corAndPvalueFnc is currently expected to accept arguments x (gene expression profiles), y (eigengene expression profiles), and alternative, with possible values including at least "greater" and "two.sided". Any additional arguments can be passed via corOptions.
The function corAndPvalueFnc should return a list which at the least contains (1) a matrix of associations of genes and eigengenes (this component should have the name given by corComponent), and (2) a matrix of the corresponding p-values, named "p" or "p.value". Other components are optional but for full functionality should include (3) nObs giving the number of observations for each association (which is the number of samples minus the number of missing values; this can in principle vary from association to association), and (4) Z giving a Z statistic for each association. If these are missing, nObs is calculated in the main function, and calculations using the Z statistic are skipped.
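A function satisfying the interface described above could look like the following base R sketch. The name myCorAndPvalue is hypothetical, p-values come from the usual Student t test for Pearson correlation, and the Z statistic uses the Fisher transform scaled by sqrt(nObs - 3); the scaling used by corAndPvalue itself may differ, so treat that line as an assumption.

```r
# Hypothetical custom association function returning cor, p, nObs and Z.
myCorAndPvalue <- function(x, y,
                           alternative = c("two.sided", "less", "greater"),
                           ...) {
  alternative <- match.arg(alternative)
  x <- as.matrix(x)
  y <- as.matrix(y)

  # Pearson correlations of columns of x (genes) with columns of y (eigengenes).
  r <- cor(x, y, use = "pairwise.complete.obs", ...)

  # Number of samples present in both columns, for every pair of columns.
  nObs <- crossprod((!is.na(x)) * 1, (!is.na(y)) * 1)

  # Student t statistic and p-value for the requested alternative.
  df <- nObs - 2
  tStat <- sqrt(df) * r / sqrt(1 - r^2)
  p <- switch(alternative,
              two.sided = 2 * pt(abs(tStat), df, lower.tail = FALSE),
              less      = pt(tStat, df),
              greater   = pt(tStat, df, lower.tail = FALSE))
  dim(p) <- dim(r)
  dimnames(p) <- dimnames(r)

  # Fisher-transform Z statistic (assumed scaling, see the text above).
  Z <- atanh(r) * sqrt(nObs - 3)

  list(cor = r, p = p, nObs = nObs, Z = Z)
}

# Small usage example on simulated data: 3 genes, 2 eigengenes, 50 samples.
set.seed(7)
x <- matrix(rnorm(150), nrow = 50, ncol = 3)
y <- matrix(rnorm(100), nrow = 50, ncol = 2)
res <- myCorAndPvalue(x, y, alternative = "greater")
```

Such a function would be supplied through corAndPvalueFnc, with corComponent left at "cor" since that is the name of the correlation component it returns.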
signedKME for eigengene based connectivity in a single data set.
corAndPvalue, bicorAndPvalue for two alternatives for calculating correlations and the corresponding p-values and Z scores. Both can be used with this function.
newConsensusTree for more details on hierarchical consensus trees and calculations.