multiData.eigengeneSignificance: Eigengene significance across multiple sets

Description

This function calculates eigengene significance and the associated significance statistics (p-values, q-values, etc.) across several data sets.

Usage

multiData.eigengeneSignificance(
  multiData, multiTrait, 
  moduleLabels, multiEigengenes = NULL, 
  useModules = NULL, 
  corAndPvalueFnc = corAndPvalue, corOptions = list(), 
  corComponent = "cor", 
  getQvalues = FALSE, setNames = NULL, 
  excludeGrey = TRUE, greyLabel = ifelse(is.numeric(moduleLabels), 0, "grey"))

Arguments

multiData

Expression data (or other data) in multi-set format (see checkSets). A vector of lists; in each list there must be a component named data whose content is a matrix, data frame, or array of dimension 2.

multiTrait

Trait or outcome data in multi-set format. Only one trait is allowed; consequently, the data component of each component list can be either a vector or a data frame (matrix, array of dimension 2).

moduleLabels

Module labels: one label for each gene in multiData.

multiEigengenes

Optional eigengenes of modules specified in moduleLabels. If not given, will be calculated from multiData.

useModules

Optional specification of module labels to which the analysis should be restricted. This could be useful if there are many modules, most of which are not interesting. Note that the "grey" module cannot be used with useModules.

corAndPvalueFnc

Function that calculates associations between expression profiles and eigengenes. See details.

corOptions

List giving additional arguments to function corAndPvalueFnc. See details.

corComponent

Name of the component of output of corAndPvalueFnc that contains the actual correlation.

getQvalues

logical: should q-values (estimates of FDR) be calculated?

setNames

names for the input sets. If not given, will be taken from names(multiData). If those are NULL as well, the names will be "Set_1", "Set_2", ....

excludeGrey

logical: should the grey module be excluded from the kME tables? Since the grey module is typically not a real module, it makes little sense to report kME values for it.

greyLabel

label that identifies the grey module.

Value

A list containing the following components. Each component is a matrix in which the rows correspond to module eigengenes and columns to data sets. Row and column names are set appropriately.

eigengeneSignificance

Module eigengene significance.

p.value

p-values (returned by corAndPvalueFnc).

q.value

q-values corresponding to the p-values above. Only returned if input getQvalues is TRUE.

Z

Z statistics (if returned by corAndPvalueFnc).

nObservations

Number of non-missing observations in each correlation/p-value.

Details

This is a convenience function that calculates module eigengene significances (i.e., correlations of module eigengenes with a given trait) across all sets in a multi-set analysis. Also returned are p-values, Z scores, numbers of present (i.e., non-missing) observations for each significance, and optionally the q-values (false discovery rates) corresponding to the p-values.
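As an illustration of the inputs described above, the following is a minimal sketch using simulated data. The set names, dimensions, and module labels are arbitrary assumptions made for the example; the call itself is guarded so that the data-construction part runs even when WGCNA is not installed.

```r
set.seed(1)
nGenes <- 60
nSamples <- c(20, 25)   # assumed sample sizes for two hypothetical sets

# Multi-set format: one list element per set, each with a `data` component
# holding samples in rows and genes in columns.
multiData <- lapply(nSamples, function(n) {
  list(data = matrix(rnorm(n * nGenes), nrow = n, ncol = nGenes,
                     dimnames = list(NULL, paste0("gene", seq_len(nGenes)))))
})
names(multiData) <- c("SetA", "SetB")

# One trait per set, stored in the same multi-set format.
multiTrait <- lapply(nSamples, function(n) list(data = data.frame(trait = rnorm(n))))
names(multiTrait) <- names(multiData)

# One module label per gene (numeric labels, so the grey label defaults to 0).
moduleLabels <- rep(1:3, each = 20)

if (requireNamespace("WGCNA", quietly = TRUE)) {
  es <- WGCNA::multiData.eigengeneSignificance(multiData, multiTrait, moduleLabels)
  # Rows correspond to module eigengenes, columns to data sets.
  print(es$eigengeneSignificance)
  print(es$p.value)
}
```

Because the data are random, the resulting significances carry no biological meaning; the sketch only shows the expected shape of the inputs and how the result components are accessed.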

The function corAndPvalueFnc is currently expected to accept arguments x (gene expression profiles) and y (eigengene expression profiles). Any additional arguments can be passed via corOptions.

The function corAndPvalueFnc should return a list which at the least contains (1) a matrix of associations of genes and eigengenes (this component should have the name given by corComponent), and (2) a matrix of the corresponding p-values, named "p" or "p.value". Other components are optional but for full functionality should include (3) nObs giving the number of observations for each association (which is the number of samples less the number of missing data; this can in principle vary from association to association), and (4) Z giving a Z statistic for each observation. If these are missing, nObs is calculated in the main function, and calculations using the Z statistic are skipped.
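The return-value contract described above can be sketched with a user-supplied function. The function below, spearmanCorAndPvalue, is a hypothetical name introduced only for this illustration and is not part of the package; it uses base R only, and its p-values and Z statistics rely on standard approximations (Student t and Fisher transform) rather than on whatever the default corAndPvalue does internally.

```r
# Hypothetical custom association function: Spearman correlation with
# approximate p-values. Returns the components described above.
spearmanCorAndPvalue <- function(x, y, ...) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  # Pairwise-complete Spearman correlations; rows index columns of x,
  # columns index columns of y.
  r <- cor(x, y, use = "pairwise.complete.obs", method = "spearman")
  # Number of samples present in both variables of each pair.
  nObs <- crossprod(1 - is.na(x), 1 - is.na(y))
  # Approximate Z statistic via the Fisher transform.
  Z <- atanh(r) * sqrt(nObs - 3)
  # Two-sided p-value from the Student t approximation.
  tStat <- r * sqrt((nObs - 2) / (1 - r^2))
  p <- 2 * pt(abs(tStat), df = nObs - 2, lower.tail = FALSE)
  list(cor = r, p = p, nObs = nObs, Z = Z)
}

# Small check on simulated data with one missing value.
set.seed(2)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)
y <- rnorm(10)
x[1, 1] <- NA
out <- spearmanCorAndPvalue(x, y)
str(out)
```

Under these assumptions, such a function could be supplied as corAndPvalueFnc = spearmanCorAndPvalue, leaving corComponent at its default "cor" because the association matrix is returned under that name.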