coClustering.permutationTest: Permutation test for co-clustering

Description

This function calculates permutation Z statistics that measure how much the co-clustering of modules between a reference clustering and a test clustering differs from what is expected by chance.

Usage

coClustering.permutationTest(
      clusters.ref, clusters.test, 
      tupletSize = 2, 
      nPermutations = 100, 
      unassignedLabel = 0, 
      randomSeed = 12345, verbose = 0, indent = 0)

Arguments

clusters.ref

Reference input clustering. A vector in which each element gives the cluster label of an object.

clusters.test

Test input clustering. Must be a vector of the same length as clusters.ref.

tupletSize

Co-clustering tuplet size.

nPermutations

Number of permutations to execute. Since significance is assessed through parametric Z statistics rather than empirical p-values, a relatively small number of permutations (at least 50) should be sufficient.

unassignedLabel

Optional specification of a clustering label that denotes unassigned objects. Objects with this label are excluded from the calculation.

randomSeed

Random seed for initializing the random number generator. If NULL, the generator is not initialized (useful when calling the function several times in sequence). The default ensures reproducibility.

verbose

If non-zero, the function prints progress messages.

indent

Indentation for progress messages. Each unit adds two spaces.

Value

observed

The observed co-clustering measures for clusters in clusters.ref.

Z

Permutation Z statistics.

permuted.mean

Means of the co-clustering measures when the test clustering is permuted.

permuted.sd

Standard deviations of the co-clustering measures when the test clustering is permuted.

permuted.cc

Values of the co-clustering measure for each permutation of the test clustering. A matrix of dimensions (number of permutations) x (number of clusters in the reference clustering).
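
The returned Z statistics can be turned into approximate p-values if one is willing to assume that the permuted co-clustering measure is roughly normally distributed. The sketch below is not part of the function's output; the Z values are made up purely for illustration, and in practice one would use the Z component of the returned list.

```r
# Illustrative Z values only (hypothetical); in practice use the Z
# component of the list returned by coClustering.permutationTest
Z <- c(0.5, 2.0, 4.5)

# One-sided p-value: probability of seeing a Z at least this large
# under a standard normal distribution (an assumption, not a guarantee)
p <- pnorm(Z, lower.tail = FALSE)
p
```

A one-sided test is used here because interest usually lies in co-clustering that is higher than expected by chance.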

Details

This function performs a permutation test to determine whether observed co-clustering statistics are significantly different from those expected by chance. It returns the observed co-clustering as well as the permutation Z statistic, calculated as (observed - mean)/sd, where mean and sd are the mean and standard deviation of the co-clustering when the test clustering is repeatedly randomly permuted.
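
The calculation described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: toyMeasure is a hypothetical stand-in for the real co-clustering measure, and the data are simulated. Only the permutation scheme and the formula (observed - mean)/sd are taken from the description.

```r
# Hypothetical stand-in for a co-clustering measure (NOT WGCNA's coClustering):
# for each reference cluster, the fraction of its members that share
# the most common test label within that cluster.
toyMeasure <- function(ref, test) {
  sapply(split(test, ref), function(x) max(table(x)) / length(x))
}

set.seed(1)
ref  <- sample(1:3, 60, replace = TRUE)
test <- ref
flip <- sample(60, 15)
test[flip] <- sample(1:3, 15, replace = TRUE)  # re-draw a quarter of the labels

observed <- toyMeasure(ref, test)

nPermutations <- 50
# One row per permutation, one column per reference cluster
permuted <- t(replicate(nPermutations, toyMeasure(ref, sample(test))))

permMean <- colMeans(permuted)
permSD   <- apply(permuted, 2, sd)

# Permutation Z statistic, one per reference cluster
Z <- (observed - permMean) / permSD
Z
```

Permuting the test labels breaks any association with the reference clustering while keeping the cluster sizes fixed, so the permuted values describe the measure under the null hypothesis of no relationship.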

References

For example, see Langfelder P, Luo R, Oldham MC, Horvath S (2011) Is My Network Module Preserved and Reproducible? PLoS Comput Biol 7(1): e1001057. Co-clustering is discussed in the Methods Supplement (Supplementary text 1) of that article.

Examples

  library(WGCNA)  # provides coClustering and coClustering.permutationTest
  set.seed(1);
  nModules = 5;
  nGenes = 100;
  cl1 = sample(c(1:nModules), nGenes, replace = TRUE);
  cl2 = sample(c(1:nModules), nGenes, replace = TRUE);
  
  cc = coClustering(cl1, cl2)

  # Choose a low number of permutations to make the example fast
  ccPerm = coClustering.permutationTest(cl1, cl2, nPermutations = 20, verbose = 1);

  ccPerm$observed
  ccPerm$Z

  # Combine cl1 and cl2 to obtain clustering that is somewhat similar to cl1:

  cl3 = cl2;
  from1 = sample(c(TRUE, FALSE), nGenes, replace = TRUE);
  cl3[from1] = cl1[from1];

  ccPerm = coClustering.permutationTest(cl1, cl3, nPermutations = 20, verbose = 1);

  # observed co-clustering is higher than before:
  ccPerm$observed

  # Note the high preservation Z statistics:
  ccPerm$Z
