Bonferroni Method
By default (i.e., when adjust = "none"), the function applies the Bonferroni method to the p-values. Letting p_1, p_2, …, p_k denote the individual (one- or two-sided) p-values of the k hypothesis tests to be combined, the combined p-value is then computed with

p_c = min(1, min(p_1, p_2, ..., p_k) * k).
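As a minimal sketch of this computation (in Python rather than R, purely to illustrate the formula; this is not poolr code):

```python
# Illustrative sketch: Bonferroni combination of k p-values,
# p_c = min(1, min(p_1, ..., p_k) * k).
def bonferroni_combined(pvals):
    k = len(pvals)
    return min(1.0, min(pvals) * k)

# Smallest p-value (0.02) times k (3), capped at 1:
print(bonferroni_combined([0.02, 0.40, 0.75]))
```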
The Bonferroni method does not assume that the p-values to be combined are independent. However, if the p-values are not independent, the method can become quite conservative (not reject often enough), depending on the dependence structure among the tests. In this case, one can adjust the method to account for such dependence (to bring the Type I error rate closer to some desired nominal significance level).
Adjustment Based on the Effective Number of Tests
When adjust is set to "nyholt", "liji", "gao", or "galwey", the Bonferroni method is adjusted based on an estimate of the effective number of tests (see meff for details on these methods for estimating the effective number of tests). In this case, argument R needs to be set to a matrix that reflects the dependence structure among the tests.
There is no general solution for constructing such a matrix, as this depends on the type of test that generated the p-values and the sidedness of these tests. If the p-values are obtained from tests whose test statistics can be assumed to follow a multivariate normal distribution and a matrix is available that reflects the correlations among the test statistics, then the mvnconv function can be used to convert this correlation matrix into the correlations among the (one- or two-sided) p-values, which can then be passed to the R argument. See ‘Examples’.
Once the effective number of tests, m, is estimated based on R using one of the four methods described above, the combined p-value is then computed with

p_c = min(1, min(p_1, p_2, ..., p_k) * m).
Alternatively, one can also directly specify the effective number of tests via the m argument (e.g., if some other method not implemented in the poolr package is used to estimate the effective number of tests). Argument R is then irrelevant and does not need to be specified.
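The adjusted formula only replaces the multiplier k with the effective number of tests m. A sketch (again in Python, not poolr; m is taken as given here, e.g., estimated with one of the methods above):

```python
# Illustrative sketch: Bonferroni combination adjusted by an effective
# number of tests m, p_c = min(1, min(p_1, ..., p_k) * m).
def bonferroni_meff(pvals, m):
    return min(1.0, min(pvals) * m)

# With k = 4 correlated tests, suppose the effective number is m = 2.5
# (a hypothetical value for demonstration):
print(bonferroni_meff([0.01, 0.20, 0.35, 0.60], m=2.5))
```

Since m <= k for dependent tests, this yields a smaller (less conservative) combined p-value than the unadjusted method.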
Adjustment Based on an Empirically-Derived Null Distribution
When adjust = "empirical", the combined p-value is computed based on an empirically-derived null distribution using pseudo replicates (using the empirical function). This is appropriate if the test statistics that generated the p-values to be combined can be assumed to follow a multivariate normal distribution and a matrix is available that reflects the correlations among the test statistics (which is specified via the R argument). In this case, test statistics are repeatedly simulated from a multivariate normal distribution under the joint null hypothesis, converted into one- or two-sided p-values (depending on the side argument), and the Bonferroni method is applied. Repeating this process size times yields a null distribution based on which the combined p-value can be computed, or more precisely, estimated, since repeated applications of this method will yield (slightly) different results. To obtain a stable estimate of the combined p-value, size should be set to a large value (the default is 10000, but this can be increased for a more precise estimate). If we consider the combined p-value an estimate of the ‘true’ combined p-value that would be obtained for a null distribution of infinite size, we can also construct a 95% (pseudo) confidence interval based on a binomial distribution.
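The simulation-based approach can be sketched as follows (Python/NumPy rather than R, purely illustrative and not poolr's implementation; two-sided tests are assumed, the correlation matrix R is hypothetical, and the combined p-value is estimated as the simple proportion of simulated values at or below the observed one):

```python
import math
import numpy as np

def two_sided_p(z):
    # Two-sided p-value of a standard-normal test statistic.
    return math.erfc(abs(z) / math.sqrt(2))

def empirical_bonferroni(pvals, R, size=10000, seed=0):
    # Simulate test statistics under the joint null from MVN(0, R),
    # convert them to two-sided p-values, apply the Bonferroni method to
    # each replicate, and estimate the combined p-value as the proportion
    # of simulated combined p-values at or below the observed one.
    k = len(pvals)
    obs = min(1.0, min(pvals) * k)
    rng = np.random.default_rng(seed)
    Z = rng.multivariate_normal(np.zeros(k), R, size=size)
    sims = np.minimum(1.0, np.vectorize(two_sided_p)(Z).min(axis=1) * k)
    return (sims <= obs).mean()

R = np.array([[1.0, 0.8], [0.8, 1.0]])  # hypothetical correlation matrix
print(empirical_bonferroni([0.02, 0.30], R, size=5000))
```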
If batchsize is unspecified, the null distribution is simulated in a single batch, which requires temporarily storing a matrix with dimensions [size,k]. When size*k is large, allocating the memory for this matrix might not be possible. Instead, one can specify a batchsize value, in which case a matrix with dimensions [batchsize,k] is repeatedly simulated until the desired size of the null distribution has been obtained.
One can also specify a vector for the size argument, in which case one must also specify a corresponding vector for the threshold argument. In that case, a stepwise algorithm is used that proceeds as follows. For j = 1, ..., length(size):

1. estimate the combined p-value based on size[j],
2. if the combined p-value is greater than or equal to threshold[j], stop (and report the combined p-value), otherwise go back to 1.
By setting size to increasing values (e.g., size = c(1000, 10000, 100000)) and threshold to decreasing values (e.g., threshold = c(.10, .01, 0)), one can quickly obtain a fairly accurate estimate of the combined p-value if it is far from significant (e.g., .10), but hone in on a more accurate estimate for a combined p-value that is closer to 0. Note that the last value of threshold should be 0 (and is forced to be inside of the function), so that the algorithm is guaranteed to terminate (hence, one can also leave out the last value of threshold, so threshold = c(.10, .01) would also work in the example above). One can also specify a single threshold (which is replicated as often as necessary depending on the length of size).
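The stepwise scheme can be sketched as follows (Python, not poolr; `estimate` is a hypothetical callable standing in for the simulation-based estimator at a given size):

```python
# Illustrative sketch of the stepwise size/threshold scheme: re-estimate
# the combined p-value with increasing simulation sizes, stopping early
# once the estimate is clearly far from significant.
def stepwise_estimate(estimate, sizes, thresholds):
    thresholds = list(thresholds) + [0.0]  # final threshold forced to 0
    for size, threshold in zip(sizes, thresholds):
        p = estimate(size)
        if p >= threshold:  # far enough from significance: stop here
            return p
    return p

# A toy estimator that ignores size (for demonstration only). The estimate
# 0.25 exceeds the first threshold (.10), so the first stage suffices:
print(stepwise_estimate(lambda size: 0.25, [1000, 10000], [0.10]))  # → 0.25
```

Because the appended final threshold is 0, the last stage always returns, mirroring the termination guarantee described above.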