preservationNetworkConnectivity: Network preservation calculations

Description

This function calculates several measures of gene network preservation. Given gene expression data in several individual data sets, it calculates the adjacency matrix of each set, forms the preservation network, and then computes several summary measures of adjacency preservation for each node (gene) in the network.

Usage

preservationNetworkConnectivity(
   multiExpr,
   useSets = NULL, useGenes = NULL,
   corFnc = "cor", corOptions = "use='p'",
   networkType = "unsigned",
   power = 6,
   sampleLinks = NULL, nLinks = 5000,
   blockSize = 1000,
   setSeed = 12345,
   weightPower = 2,
   verbose = 2, indent = 0)
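
A minimal sketch of a call, assuming the function is the one exported by the WGCNA package and that the multi-set input is a list with one element per set, each holding a `data` matrix (samples in rows, genes in columns). The simulated data, the set names and the helper `makeSet` are made up for illustration; the call is skipped if WGCNA is not installed.

```r
# Simulated input: 2 sets, 15 samples each, the same 20 genes in both sets.
# All values are random and purely illustrative.
set.seed(42)
nSamples <- 15
nGenes <- 20
geneNames <- paste0("Gene", seq_len(nGenes))

# Hypothetical helper: builds one set in the multi-set format.
makeSet <- function() {
  x <- matrix(rnorm(nSamples * nGenes), nrow = nSamples, ncol = nGenes)
  colnames(x) <- geneNames
  list(data = x)
}
multiExpr <- list(Set1 = makeSet(), Set2 = makeSet())

# ASSUMPTION: the function lives in the WGCNA namespace.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  res <- WGCNA::preservationNetworkConnectivity(
    multiExpr,
    corFnc = "cor", corOptions = "use='p'",
    networkType = "unsigned", power = 6,
    sampleLinks = FALSE,   # use all connections; the data set is tiny
    verbose = 0)
  # Inspect two of the returned components.
  str(res$complete)
  str(res$pairwise)
}
```

With so few genes, sampling links brings no benefit, so the sketch sets `sampleLinks = FALSE` explicitly instead of relying on the default.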

Value

A list with the following components:

pairwise: a matrix with rows corresponding to genes and columns to unique pairs of given sets, giving the pairwise preservation of the adjacencies connecting the gene to all other genes.
complete: a vector with one entry for each input gene containing the complete mean preservation of the adjacencies connecting the gene to all other genes.
pairwiseWeighted: a matrix with rows corresponding to genes and columns to unique pairs of given sets, giving the pairwise weighted preservation of the adjacencies connecting the gene to all other genes.
completeWeighted: a vector with one entry for each input gene containing the complete weighted mean preservation of the adjacencies connecting the gene to all other genes.
pairwiseHyperbolic: a matrix with rows corresponding to genes and columns to unique pairs of given sets, giving the pairwise hyperbolic preservation of the adjacencies connecting the gene to all other genes.
completeHyperbolic: a vector with one entry for each input gene containing the complete mean hyperbolic preservation of the adjacencies connecting the gene to all other genes.
pairwiseWeightedHyperbolic: a matrix with rows corresponding to genes and columns to unique pairs of given sets, giving the pairwise weighted hyperbolic preservation of the adjacencies connecting the gene to all other genes.
completeWeightedHyperbolic: a vector with one entry for each input gene containing the complete weighted hyperbolic mean preservation of the adjacencies connecting the gene to all other genes.

Arguments

multiExpr: expression data in the multi-set format (see checkSets). A vector of lists, one per set. Each set must contain a component named data that holds the expression data, with rows corresponding to samples and columns to genes or probes.
useSets: optional specification of sets to be used for the preservation calculation. Defaults to using all sets.
useGenes: optional specification of genes to be used for the preservation calculation. Defaults to all genes.
corFnc: character string containing the name of the function to calculate correlation. Suggested functions include "cor" and "bicor".
corOptions: character string giving further arguments to the correlation function, for example "use='p'".
networkType: a character string encoding network type. Recognized values are (unique abbreviations of) "unsigned", "signed", and "signed hybrid".
power: soft thresholding power for network construction. Should be a number greater than 1.
sampleLinks: logical: should network connections be sampled (TRUE) or should all connections be used systematically (FALSE)?
nLinks: number of links to be sampled. Should be set such that nLinks * nNeighbors is several times larger than the number of genes.
blockSize: correlation calculations will be split into square blocks of this size, to prevent running out of memory for large gene sets.
setSeed: seed to be used for sampling, for repeatability. If a seed already exists, it is saved before the sampling starts and restored upon exit.
weightPower: power with which higher adjacencies will be weighted in weighted means.
verbose: integer level of verbosity. Zero means silent; higher values make the output progressively more verbose.
indent: indentation for diagnostic messages. Zero means no indentation; each unit adds two spaces.
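
The roles of networkType and power can be illustrated with a short base R sketch. It assumes the usual WGCNA adjacency conventions (unsigned: |cor|^power; signed: ((1 + cor)/2)^power; signed hybrid: cor^power for positive correlations and 0 otherwise). This is an illustration on made-up data, not the function's internal code; the variable names are hypothetical.

```r
# Toy expression matrix: 15 samples (rows) by 4 genes (columns), random values.
set.seed(1)
expr <- matrix(rnorm(15 * 4), nrow = 15, ncol = 4)
power <- 6

# Pairwise-complete Pearson correlation, matching corOptions = "use='p'".
corMat <- cor(expr, use = "p")

# ASSUMPTION: adjacency definitions follow the usual WGCNA conventions.
adjUnsigned <- abs(corMat)^power                 # sign of correlation ignored
adjSigned   <- ((1 + corMat) / 2)^power          # negative correlations map near 0
adjHybrid   <- ifelse(corMat > 0, corMat, 0)^power  # negative correlations set to 0
```

In all three variants, raising to a power greater than 1 suppresses weak correlations relative to strong ones, which is the point of soft thresholding.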

Author

Peter Langfelder

Details

The preservation network is formed from the adjacencies of the compared sets. For 'complete' preservations, all given sets are compared at once; for 'pairwise' preservations, the sets are compared in pairs. Unweighted preservations are simple mean preservations for each node; their weighted counterparts are weighted averages in which the preservation of adjacencies \(A^{(1)}_{ij}\) and \(A^{(2)}_{ij}\) of nodes \(i,j\) between sets 1 and 2 is weighted by \([(A^{(1)}_{ij} + A^{(2)}_{ij})/2]^{\mathrm{weightPower}}\). The hyperbolic preservation is based on \(\tanh[(\mathrm{max} - \mathrm{min})/(\mathrm{max} + \mathrm{min})^2]\), where \(\mathrm{max}\) and \(\mathrm{min}\) are the componentwise maximum and minimum of the compared adjacencies, respectively.
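
The weighting described above can be worked through on a toy example in base R. The sketch assumes that the preservation of a single adjacency pair is 1 - (max - min), as in the eigengene network paper listed under References; the text here does not state this explicitly. It also computes only the tanh term of the hyperbolic measure, since how that term enters the final score is not spelled out here. The adjacency values and variable names are made up.

```r
# Toy adjacency matrices for the same 3 genes in two sets (made-up values).
A1 <- matrix(c(0.0, 0.8, 0.2,
               0.8, 0.0, 0.5,
               0.2, 0.5, 0.0), nrow = 3, byrow = TRUE)
A2 <- matrix(c(0.0, 0.6, 0.2,
               0.6, 0.0, 0.1,
               0.2, 0.1, 0.0), nrow = 3, byrow = TRUE)
weightPower <- 2
n <- nrow(A1)

aMax <- pmax(A1, A2)   # componentwise maximum of the compared adjacencies
aMin <- pmin(A1, A2)   # componentwise minimum

# ASSUMPTION: preservation of one adjacency pair is 1 - (max - min).
pres <- 1 - (aMax - aMin)

# Weights from the Details text: mean adjacency raised to weightPower.
w <- ((A1 + A2) / 2)^weightPower

offDiag <- !diag(n)    # logical mask that excludes self-connections

# Per-gene simple mean of preservations over all other genes.
unweighted <- sapply(seq_len(n), function(i) mean(pres[i, offDiag[i, ]]))

# Per-gene weighted mean: strongly connected pairs count more.
weighted <- sapply(seq_len(n), function(i) {
  keep <- offDiag[i, ]
  sum(w[i, keep] * pres[i, keep]) / sum(w[i, keep])
})

# tanh term of the hyperbolic measure; 0/0 on the diagonal gives NaN,
# so the diagonal is marked as missing.
hypTerm <- tanh((aMax - aMin) / (aMax + aMin)^2)
diag(hypTerm) <- NA
```

For gene 2 in this toy data, the poorly preserved link to gene 3 has a small mean adjacency, so it is down-weighted and the weighted mean comes out higher than the unweighted one.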

References

Langfelder P, Horvath S (2007) Eigengene networks for studying the relationships between co-expression modules. BMC Systems Biology 1:54