hierarchicalConsensusKME: Calculation of measures of fuzzy module membership (KME) in hierarchical consensus modules

Description

This function calculates several measures of fuzzy module membership in hiearchical consensus modules.

Usage

hierarchicalConsensusKME(
   multiExpr,
   moduleLabels,
   multiWeights = NULL,
   multiEigengenes = NULL,
   consensusTree,
   signed = TRUE,
   useModules = NULL,
   metaAnalysisWeights = NULL,
   corAndPvalueFnc = corAndPvalue, corOptions = list(),
   corComponent = "cor", getFDR = FALSE,
   useRankPvalue = TRUE,
   rankPvalueOptions = list(calculateQvalue = getFDR, pValueMethod = "scale"),
   setNames = names(multiExpr), excludeGrey = TRUE,
   greyLabel = if (is.numeric(moduleLabels)) 0 else "grey",
   reportWeightType = NULL,
   getOwnModuleZ = TRUE,
   getBestModuleZ = TRUE,
   getOwnConsensusKME = TRUE,
   getBestConsensusKME = TRUE,
   getAverageKME = FALSE,
   getConsensusKME = TRUE,
   getMetaColsFor1Set = FALSE,
   getMetaP = FALSE,
   getMetaFDR = getMetaP && getFDR,
   getSetKME = TRUE,
   getSetZ = FALSE,
   getSetP = FALSE,
   getSetFDR = getSetP && getFDR,
   includeID = TRUE,
   additionalGeneInfo = NULL,
   includeWeightTypeInColnames = TRUE)

Arguments

multiExpr

Expression data in the multi-set format (see checkSets). A vector of lists, one per set. Each set must contain a component data that contains the expression data, with rows corresponding to samples and columns to genes or probes.

moduleLabels

A vector with one entry per column (gene or probe) in multiExpr, giving the module labels.

multiWeights

optional observation weights for data in multiExpr, in the same format (and dimensions) as multiExpr. These weights are used in calculation of KME, i.e., the correlation of module eigengenes with data in multiExpr. The module eigengenes are not weighted in this calculation.

multiEigengenes

Optional specification of module eigengenes of the modules (moduleLabels) in data sets within multiExpr. If not given, will be calculated.

consensusTree

A list specifying the consensus calculation. See details.

signed

Logical: should module membership be considered singed? Signed membership should be used for signed (including signed hybrid) networks and means that negative module membership means the gene is not a member of the module. In other words, in signed networks negative kME values are not considered significant and the corresponding p-values will be one-sided. In unsigned networks, negative kME values are considered significant and the corresponding p-values will be two-sided.

useModules

Optional vector specifying which modules should be used. Defaults to all modules except the unassigned module.

metaAnalysisWeights

Optional specification of meta-analysis weights for each input set. If given, must be a numeric vector of length equal the number of input data sets (i.e., length(multiExpr)). These weights will be used in addition to constant weights and weights proportional to number of samples (observations) in each set.

corAndPvalueFnc

Function that calculates associations between expression profiles and eigengenes. See details.

corOptions

List giving additional arguments to function corAndPvalueFnc. See details.

corComponent

Name of the component of output of corAndPvalueFnc that contains the actual correlation.

getFDR

Logical: should FDR be calculated?

useRankPvalue

Logical: should the rankPvalue function be used to obtain alternative meta-analysis statistics?

rankPvalueOptions

Additional options for function rankPvalue. These include na.last (default "keep"), ties.method (default "average"), calculateQvalue (default copied from input getQvalues), and pValueMethod (default "scale"). See the help file for rankPvalue for full details.

setNames

Names for the input sets. If not given, will be taken from names(multiExpr). If those are NULL as well, the names will be "Set_1", "Set_2", ....

excludeGrey

logical: should the grey module be excluded from the kME tables? Since the grey module is typically not a real module, it makes little sense to report kME values for it.

greyLabel

label that labels the grey module.

reportWeightType

One of "equal", "rootDoF", "DoF", "user". Indicates which of the weights should be reported in the output. If not given, all available weight types will be reported; this always includes "equal", "rootDoF", "DoF", while "user" weights are reported if metaAnalysisWeights above is given.

getOwnModuleZ

Logical: should meta-analysis Z statistic in own module be returned as a column of the output?

getBestModuleZ

Logical: should highest meta-analysis Z statistic across all modules and the corresponding module be returned as columns of the output?

getOwnConsensusKME

Logical: should consensus KME (eigengene-based connectivity) statistic in own module be returned as a column of the output?

getBestConsensusKME

Logical: should highest consensus KME across all modules and the corresponding module be returned as columns of the output?

getAverageKME

Logical: Should average KME be calculated?

getConsensusKME

Logical: should consensus KME be calculated?

getMetaColsFor1Set

Logical: should the meta-statistics be returned if the input data only have 1 set? For 1 set, meta- and individual kME values are the same, so meta-columns essentially duplicate individual columns.

getMetaP

Logical: should meta-analysis p-values corresponding to the KME meta-analysis Z statistics be calculated?

getMetaFDR

Logical: should FDR estimates for the meta-analysis p-values corresponding to the KME meta-analysis Z statistics be calculated?

getSetKME

Logical: should KME values for individual sets be returned?

getSetZ

Logical: should Z statistics corresponding to KME for individual sets be returned?

getSetP

Logical: should p values corresponding to KME for individual sets be returned?

getSetFDR

Logical: should FDR estimates corresponding to KME for individual sets be returned?

includeID

Logical: should gene ID (taken from column names of multiExpr) be included as the first column in the output?

additionalGeneInfo

Optional data frame with rows corresponding to genes in multiExpr that should be included as part of the output.

includeWeightTypeInColnames

Logical: should weight type ("equal", "rootDoF", "DoF", "user") be included in appropriate meta-analysis column names?

Value

Data frame with the following components, some of which may be missing depending on input options (for easier readability the order here is not the same as in the actual output):

Gene ID, taken from the column names of the first input data set

If given, a copy of additionalGeneInfo.

Z.kME.inOwnModule

Meta-analysis Z statistic for membership in assigned module.

maxZ.kME

Maximum meta-analysis Z statistic for membership across all modules.

moduleOfMaxZ.kME

Module in which the maximum meta-analysis Z statistic is attained.

consKME.inOwnModule

Consensus KME in assigned module.

maxConsKME

Maximum consensus KME across all modules.

moduleOfMaxConsKME

Module in which the maximum consensus KME is attained.

consensus.kME.1, consensus.kME.2, ...

Consensus kME (that is, the requested quantile of the kMEs in the individual data sets)in each module for each gene across the input data sets. The module labels (here 1, 2, etc.) correspond to those in moduleLabels.

weightedAverage.equalWeights.kME1, weightedAverage.equalWeights.kME2, ...

Average kME in each module for each gene across the input data sets.

weightedAverage.RootDoFWeights.kME1, weightedAverage.RootDoFWeights.kME2, ...

Weighted average kME in each module for each gene across the input data sets. The weight of each data set is proportional to the square root of the number of samples in the set.

weightedAverage.DoFWeights.kME1, weightedAverage.DoFWeights.kME2, ...

Weighted average kME in each module for each gene across the input data sets. The weight of each data set is proportional to number of samples in the set.

weightedAverage.userWeights.kME1, weightedAverage.userWeights.kME2, ...

(Only present if input metaAnalysisWeights is non-NULL.) Weighted average kME in each module for each gene across the input data sets. The weight of each data set is given in metaAnalysisWeights.

meta.Z.equalWeights.kME1, meta.Z.equalWeights.kME2, ...

Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set equally. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.Z.RootDoFWeights.kME1, meta.Z.RootDoFWeights.kME2, ...

Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by the square root of the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.Z.DoFWeights.kME1, meta.Z.DoFWeights.kME2, ...

Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.Z.userWeights.kME1, meta.Z.userWeights.kME2, ...

Meta-analysis Z statistic for kME in each module, obtained by weighing the Z scores in each set by metaAnalysisWeights. Only returned if metaAnalysisWeights is non-NULL and the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.p.equalWeights.kME1, meta.p.equalWeights.kME2, ...

p-values obtained from the equal-weight meta-analysis Z statistics. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.p.RootDoFWeights.kME1, meta.p.RootDoFWeights.kME2, ...

p-values obtained from the meta-analysis Z statistics with weights proportional to the square root of the number of samples. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.p.DoFWeights.kME1, meta.p.DoFWeights.kME2, ...

p-values obtained from the degree-of-freedom weight meta-analysis Z statistics. Only returned if the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.p.userWeights.kME1, meta.p.userWeights.kME2, ...

p-values obtained from the user-supplied weight meta-analysis Z statistics. Only returned if metaAnalysisWeights is non-NULL and the function corAndPvalueFnc returns the Z statistics corresponding to the correlations.

meta.q.equalWeights.kME1, meta.q.equalWeights.kME2, ...

q-values obtained from the equal-weight meta-analysis p-values. Only present if getQvalues is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.

meta.q.RootDoFWeights.kME1, meta.q.RootDoFWeights.kME2, ...

q-values obtained from the meta-analysis p-values with weights proportional to the square root of the number of samples. Only present if getQvalues is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.

meta.q.DoFWeights.kME1, meta.q.DoFWeights.kME2, ...

q-values obtained from the degree-of-freedom weight meta-analysis p-values. Only present if getQvalues is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.

meta.q.userWeights.kME1, meta.q.userWeights.kME2, ...

q-values obtained from the user-specified weight meta-analysis p-values. Only present if metaAnalysisWeights is non-NULL, getQvalues is TRUE and the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.

The next set of columns contain the results of function rankPvalue and are only present if input useRankPvalue is TRUE. Some columns may be missing depending on the options specified in rankPvalueOptions. We explicitly list columns that are based on weighing each set equally; names of these columns carry the suffix .equalWeights

pValueExtremeRank.ME1.equalWeights, pValueExtremeRank.ME2.equalWeights, ...

This is the minimum between pValueLowRank and pValueHighRank, i.e. min(pValueLow, pValueHigh)

pValueLowRank.ME1.equalWeights, pValueLowRank.ME2.equalWeights, ...

Asymptotic p-value for observing a consistently low value based on the rank method.

pValueHighRank.ME1.equalWeights, pValueHighRank.ME2.equalWeights, ...

Asymptotic p-value for observing a consistently low value across the columns of datS based on the rank method.

pValueExtremeScale.ME1.equalWeights, pValueExtremeScale.ME2.equalWeights, ...

This is the minimum between pValueLowScale and pValueHighScale, i.e. min(pValueLow, pValueHigh)

pValueLowScale.ME1.equalWeights, pValueLowScale.ME2.equalWeights, ...

Asymptotic p-value for observing a consistently low value across the columns of datS based on the Scale method.

pValueHighScale.ME1.equalWeights, pValueHighScale.ME2.equalWeights, ...

Asymptotic p-value for observing a consistently low value across the columns of datS based on the Scale method.

qValueExtremeRank.ME1.equalWeights, qValueExtremeRank.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueExtremeRank

qValueLowRank.ME1.equalWeights, qValueLowRank.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueLowRank

qValueHighRank.ME1.equalWeights, lueHighRank.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueHighRank

qValueExtremeScale.ME1.equalWeights, qValueExtremeScale.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueExtremeScale

qValueLowScale.ME1.equalWeights, qValueLowScale.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueLowScale

qValueHighScale.ME1.equalWeights,qValueHighScale.ME2.equalWeights, ...

local false discovery rate (q-value) corresponding to the p-value pValueHighScale

...

Analogous columns corresponding to weighing individual sets by the square root of the number of samples, by number of samples, and by user weights (if given). The corresponding column name suffixes are .RootDoFWeights, .DoFWeights, and .userWeights.

The following set of columns summarize kME in individual input data sets.

kME1.Set_1, kME1.Set_2, ..., kME2.Set_1, kME2.Set_2, ...

kME values for each gene in each module in each given data set.

p.kME1.Set_1, p.kME1.Set_2, ..., p.kME2.Set_1, p.kME2.Set_2, ...

p-values corresponding to kME values for each gene in each module in each given data set.

q.kME1.Set_1, q.kME1.Set_2, ..., q.kME2.Set_1, q.kME2.Set_2, ...

q-values corresponding to kME values for each gene in each module in each given data set. Only returned if getQvalues is TRUE.

Z.kME1.Set_1, Z.kME1.Set_2, ..., Z.kME2.Set_1, Z.kME2.Set_2, ...

Z statistics corresponding to kME values for each gene in each module in each given data set. Only present if the function corAndPvalueFnc returns the Z statistics corresponding to the kME values.

Details

This function calculates several measures of (hierarchical) consensus KME (eigengene-based intramodular connectivity or fuzzy module membership) for all genes in all modules.

First, it calculates the meta-analysis Z statistics for correlations between genes and module eigengenes; this is known as the consensus module membership Z statistic. The meta-analysis weights can be specified by the user either explicitly or implicitly ("equal", "RootDoF" or "DoF").

Second, it can calculate the consensus KME, i.e., the hierarchical consensus of the KMEs (correlations with eigengenes) across the individual sets. The consensus calculation is specified in the argument consensusTree; typically, the consensusTree used here will be the same as the one used for the actual consensus network construction and module identification. See newConsensusTree for details on how to specify consensus trees.

Third, the function can also calculate the (weighted) average KME using the meta-analysis weights; the average KME can be interpreted as the meta-analysis of the KMEs in the individual sets. This is related to but somewhat distinct from the meta-analysis Z statistics.

In addition to these, optional output also includes, for each gene, KME values in the module to which the gene is assigned as well as the maximum KME values and modules for which the maxima are attained. For most genes, the assigned module will be the one with highest KME values, but for some genes the assigned module and module of maximum KME may be different.

The function corAndPvalueFnc is currently is expected to accept arguments x (gene expression profiles), y (eigengene expression profiles), and alternative with possibilities at least "greater", "two.sided". If weights are given, these are passed to corAndPvalueFnc as argument weights.x. Any additional arguments can be passed via corOptions.

The function corAndPvalueFnc should return a list which at the least contains (1) a matrix of associations of genes and eigengenes (this component should have the name given by corComponent), and (2) a matrix of the corresponding p-values, named "p" or "p.value". Other components are optional but for full functionality should include (3) nObs giving the number of observations for each association (which is the number of samples less number of missing data - this can in principle vary from association to association), and (4) Z giving a Z static for each observation. If these are missing, nObs is calculated in the main function, and calculations using the Z statistic are skipped.

Description

Usage

Arguments

Value

Details

See Also