The purpose of this analysis is to verify that no combination of replicates
(among all possible replicate combinations) yields a non-significant pattern
when the initial pattern, computed from the combined replicates, was shown to be significant.
A small example:
Assume a PhyloExpressionSet stores 3 developmental stages with 3 replicates measured for each stage.
The 9 replicates in total are denoted as: 1.1, 1.2, 1.3, 2.1, 2.2, 2.3, 3.1, 3.2, 3.3. Now the function computes the
statistical significance of each pattern derived by the corresponding combination of replicates, e.g.
1.1, 2.1, 3.1 -> p-value for combination 1
1.1, 2.2, 3.1 -> p-value for combination 2
1.1, 2.3, 3.1 -> p-value for combination 3
1.2, 2.1, 3.1 -> p-value for combination 4
1.2, 2.2, 3.1 -> p-value for combination 5
1.2, 2.3, 3.1 -> p-value for combination 6
1.3, 2.1, 3.1 -> p-value for combination 7
1.3, 2.2, 3.1 -> p-value for combination 8
1.3, 2.3, 3.1 -> p-value for combination 9
This procedure yields 27 p-values for the \(3^3\) (\(n_{replicates}^{n_{stages}}\)) replicate combinations.
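The enumeration above can be reproduced with a minimal base R sketch. The variable names (`replicates`, `replicate_indices`, `combinations`) are illustrative assumptions and are not taken from the function's source code.

```r
# Number of replicates per stage (3 stages with 3 replicates each)
replicates <- c(3, 3, 3)

# One index vector per stage: 1:3, 1:3, 1:3
replicate_indices <- lapply(replicates, seq_len)
names(replicate_indices) <- paste0("stage", seq_along(replicates))

# Each row is one replicate combination (one replicate chosen per stage)
combinations <- expand.grid(replicate_indices)

nrow(combinations)  # 27
head(combinations, 3)
```

Note that expand.grid varies the first factor fastest, so the row order differs from the listing above, although the set of combinations is the same.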
Note that with a large number of stages/experiments and a large number of replicates,
the number of p-values to compute, and with it the computation time, grows as \(n_{replicates}^{n_{stages}}\). For 11 stages and 4 replicates, \(4^{11} = 4194304\) p-values have to be computed.
Each p-value computation itself is based on a permutation test running with 1000 or more permutations. Be aware that this might take some time.
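Before starting a long run, the workload can be estimated in advance. The following is a rough sketch that assumes 1000 permutations per test; taking the product over the replicate vector also covers stages with unequal numbers of replicates.

```r
# 11 stages with 4 replicates each
replicates <- rep(4, 11)

# Number of replicate combinations, i.e. number of p-values to compute
n_combinations <- prod(replicates)

# Assumed number of permutations per permutation test
permutations <- 1000

# Total number of permuted patterns evaluated over the whole run
n_permuted_patterns <- n_combinations * permutations

n_combinations  # 4194304
```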
The p-value vector returned by this function can then be plotted to see
whether a critical value \(\alpha\) is exceeded or not (e.g. \(\alpha = 0.05\)).
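As an illustration, the check against \(\alpha\) might look as follows. The p-values here are random placeholders generated only so that the code runs; they are not results of any analysis. The helper `plot_p_values` is a hypothetical name, not part of the package.

```r
set.seed(1)

# Placeholder p-values standing in for the returned vector (NOT real results)
p_values <- runif(27, min = 0, max = 0.1)
alpha <- 0.05

# Indices of replicate combinations whose p-value exceeds alpha
exceeding <- which(p_values > alpha)

# TRUE only if every replicate combination remains significant
all_significant <- length(exceeding) == 0

# Hypothetical helper: scatter plot of p-values with alpha as a dashed line
plot_p_values <- function(p, alpha = 0.05) {
  plot(p, ylim = c(0, max(p, alpha)), pch = 19,
       xlab = "Replicate combination", ylab = "p-value")
  abline(h = alpha, lty = 2)
}
```

Calling `plot_p_values(p_values, alpha)` draws the plot; any point above the dashed line marks a replicate combination whose pattern is not significant at the chosen level.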
The function receives a standard PhyloExpressionSet or DivergenceExpressionSet object and a vector storing the number of replicates present in each stage or experiment. Based on these arguments, the function computes all possible replicate combinations using the expand.grid function and performs a permutation test (such as a FlatLineTest) for each replicate combination. The permutation parameter of this function specifies the number of permutations to be performed for each permutation test. Once all p-values are computed, a numeric vector storing the p-value of each replicate combination is returned.
In other words, for each replicate combination present in the PhyloExpressionSet or DivergenceExpressionSet object, the TAI or TDI pattern of the corresponding replicate combination is tested for its statistical significance based on the underlying test statistic.
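The overall procedure can be sketched in base R on toy data. Everything in this sketch is an assumption made for illustration: the data are random, `age_index` assumes the pattern is the expression-weighted mean of the phylostratum, and `permutation_p_value` is a simplified empirical permutation test used as a stand-in, not the package's FlatLineTest implementation.

```r
set.seed(42)

# Hypothetical toy data: 50 genes, 3 stages with 3 replicates each
n_genes <- 50
replicates <- c(3, 3, 3)
stage_of_column <- rep(seq_along(replicates), times = replicates)
expression <- matrix(rexp(n_genes * sum(replicates)), nrow = n_genes)
phylostratum <- sample(1:5, n_genes, replace = TRUE)

# Pattern per column: expression-weighted mean of the gene age (assumed form)
age_index <- function(ps, expr) colSums(ps * expr) / colSums(expr)

# Simplified stand-in test: compare the variance of the observed pattern
# with the variances obtained after shuffling the gene ages
permutation_p_value <- function(ps, expr, permutations = 1000) {
  observed <- var(age_index(ps, expr))
  permuted <- replicate(permutations, var(age_index(sample(ps), expr)))
  (sum(permuted >= observed) + 1) / (permutations + 1)
}

# Column positions of the replicates belonging to each stage
columns_by_stage <- split(seq_along(stage_of_column), stage_of_column)

# All replicate combinations: one replicate index per stage
combinations <- expand.grid(lapply(replicates, seq_len))

# One p-value per replicate combination
p_values <- apply(combinations, 1, function(choice) {
  cols <- mapply(function(stage_cols, k) stage_cols[k], columns_by_stage, choice)
  permutation_p_value(phylostratum, expression[, cols, drop = FALSE],
                      permutations = 200)
})

length(p_values)  # 27
```

The sketch uses 200 permutations per combination only to keep the run short; the loop over rows of `combinations` is the part that dominates the run time as the number of stages grows.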
This function is also able to perform all computations in parallel using multicore processing. The underlying statistical tests are written in C++ and optimized for fast computations.