The methods pvalue, midpvalue, pvalue_interval and size compute the
\(p\)-value, mid-\(p\)-value, \(p\)-value interval and test size,
respectively.
For pvalue(), the global \(p\)-value (method = "global") is
returned by default and is given with an associated 99% confidence interval
when resampling is used to determine the null distribution (which for maximum
statistics may be true even in the asymptotic case).
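As a rough illustration of a resampling-based \(p\)-value with a 99% confidence interval, the sketch below treats the number of resampled statistics at least as extreme as the observed one as a binomial count. The text above does not say which interval construction is used; the exact binomial (Clopper-Pearson) interval, the estimate count / resamples, and the helper names binom_cdf and clopper_pearson are all assumptions made for this example.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(successes, n, level=0.99):
    """Exact binomial confidence interval for a proportion (assumed method)."""
    alpha = 1.0 - level

    def solve(k, target):
        # binom_cdf(k, n, p) decreases in p, so bisect for cdf == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2.0
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    lower = 0.0 if successes == 0 else solve(successes - 1, 1.0 - alpha / 2.0)
    upper = 1.0 if successes == n else solve(successes, alpha / 2.0)
    return lower, upper


# Hypothetical numbers: 30 of 1000 resampled statistics were at least as
# extreme as the observed statistic.
n_resamples, n_extreme = 1000, 30
p_hat = n_extreme / n_resamples
ci_lower, ci_upper = clopper_pearson(n_extreme, n_resamples, level=0.99)
```

The interval narrows as the number of resamples grows, which is why the reported interval reflects Monte Carlo error only and not sampling variability in the data.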
The familywise error rate (FWER) is always controlled under the global null
hypothesis, i.e., in the weak sense, implying that the smallest
adjusted \(p\)-value is valid without further assumptions. Control of the
FWER under any partial configuration of the null hypotheses, i.e., in the
strong sense, as is typically desired for multiple tests and
comparisons, requires that the subset pivotality condition holds
(Westfall and Young, 1993, pp. 42--43; Bretz, Hothorn and Westfall, 2011,
pp. 136--137). In addition, for methods based on the joint distribution of
the test statistics, failure of the joint exchangeability assumption
(Westfall and Troendle, 2008; Bretz, Hothorn and Westfall, 2011, pp. 129--130)
may cause excess Type I errors.
Assuming subset pivotality, single-step or free step-down
adjusted \(p\)-values using max-\(T\) procedures are obtained by setting
method to "single-step" or "step-down", respectively. In both cases, the
distribution argument specifies whether the adjustment is based on the joint
distribution ("joint") or the marginal distributions ("marginal") of the test
statistics. For procedures based on the marginal distributions, Bonferroni- or
Šidák-type adjustment can be specified through the type argument by setting it
to "Bonferroni" or "Sidak", respectively.
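To make the difference between the Bonferroni- and Šidák-type adjustments and between the single-step and step-down variants concrete, the following sketch applies the classical textbook formulas to a vector of unadjusted \(p\)-values. It assumes continuous \(p\)-values and therefore does not reproduce the discreteness-aware marginal procedures described here; the function name adjust_marginal and its arguments are hypothetical.

```python
def adjust_marginal(pvalues, kind="Bonferroni", step_down=False):
    """Classical marginal adjustment of unadjusted p-values (continuous case)."""
    m = len(pvalues)

    def combine(p, k):
        # k is the number of hypotheses still under consideration.
        if kind == "Bonferroni":
            return min(1.0, k * p)
        if kind == "Sidak":
            return 1.0 - (1.0 - p) ** k
        raise ValueError("kind must be 'Bonferroni' or 'Sidak'")

    if not step_down:
        # Single-step: every p-value is adjusted for all m hypotheses.
        return [combine(p, m) for p in pvalues]

    # Step-down: go from the smallest to the largest p-value, adjust for the
    # hypotheses not yet processed, and keep the result monotone.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, combine(pvalues[i], m - rank))
        adjusted[i] = running_max
    return adjusted


p = [0.01, 0.04, 0.03]
single_bonf = adjust_marginal(p, "Bonferroni")
step_bonf = adjust_marginal(p, "Bonferroni", step_down=True)
single_sidak = adjust_marginal(p, "Sidak")
```

In this sketch the step-down result is never larger than the single-step one, and the Šidák-type result is never larger than the Bonferroni-type one.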
The \(p\)-value adjustment procedures based on the joint distribution of the
test statistics fully utilize distributional characteristics, such as
discreteness and dependence structure, whereas procedures using the marginal
distributions only incorporate discreteness. Hence, the joint
distribution-based procedures are typically more powerful. Details regarding
the single-step and free step-down procedures based on the joint
distribution can be found in Westfall and Young (1993); in particular, this
implementation uses Equation 2.8 with Algorithms 2.5 and 2.8, respectively.
Westfall and Wolfinger (1997) provide details of the marginal
distributions-based single-step and free step-down procedures. The
generalization of Westfall and Wolfinger (1997) to arbitrary test statistics,
as implemented here, is given by Westfall and Troendle (2008).
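The idea behind the joint-distribution max-\(T\) adjustments can be sketched with a small two-sample permutation example. This is an illustration of the general single-step and free step-down scheme, not the package's implementation: the two-group layout, the unstandardized absolute mean differences used as test statistics, the full enumeration of group assignments, the data, and the names abs_mean_diffs and max_t_adjust are all assumptions.

```python
from itertools import combinations


def abs_mean_diffs(rows, group1):
    """|mean(group 1) - mean(group 2)| for each endpoint (column)."""
    g1 = [rows[i] for i in group1]
    g2 = [rows[i] for i in range(len(rows)) if i not in group1]
    return [
        abs(sum(r[j] for r in g1) / len(g1) - sum(r[j] for r in g2) / len(g2))
        for j in range(len(rows[0]))
    ]


def max_t_adjust(rows, n1, step_down=False):
    """Adjusted p-values from the permutation distribution of the maximum."""
    n, k = len(rows), len(rows[0])
    observed = abs_mean_diffs(rows, set(range(n1)))
    # Statistics for every possible assignment of n1 rows to group 1.
    perms = [abs_mean_diffs(rows, set(c)) for c in combinations(range(n), n1)]
    eps = 1e-12  # tolerance for ties in floating-point arithmetic

    if not step_down:
        maxima = [max(t) for t in perms]
        return [sum(m >= t - eps for m in maxima) / len(perms) for t in observed]

    # Free step-down: order by decreasing observed statistic and take the
    # maximum only over the hypotheses not yet processed.
    order = sorted(range(k), key=lambda j: -observed[j])
    adjusted = [0.0] * k
    running_max = 0.0
    for rank, j in enumerate(order):
        remaining = order[rank:]
        count = sum(
            max(t[r] for r in remaining) >= observed[j] - eps for t in perms
        )
        running_max = max(running_max, count / len(perms))
        adjusted[j] = running_max
    return adjusted


# Hypothetical data: 8 subjects (first 4 in group 1), 3 endpoints.
rows = [
    [5.1, 3.2, 1.0], [4.8, 3.9, 1.4], [5.6, 3.1, 0.9], [5.3, 3.6, 1.2],
    [3.9, 3.4, 1.1], [4.1, 3.0, 1.3], [3.6, 3.7, 0.8], [4.4, 3.3, 1.5],
]
single_step = max_t_adjust(rows, n1=4)
free_step_down = max_t_adjust(rows, n1=4, step_down=True)
```

Because the maximum is taken over the permuted statistics jointly, the dependence between endpoints is carried into the adjustment; the step-down variant can only reduce the adjusted \(p\)-values relative to the single-step variant.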
Unadjusted \(p\)-values are obtained using method = "unadjusted".
For midpvalue(), the global mid-\(p\)-value is given with an
associated 99% mid-\(p\) confidence interval when resampling is used to
determine the null distribution. The two-sided mid-\(p\)-value is computed
according to the minimum likelihood method (Hirji et al., 1991).
The \(p\)-value interval \((p_0, p_1]\) obtained by pvalue_interval() was
proposed by Berger (2000, 2001), where the upper
endpoint \(p_1\) is the conventional \(p\)-value and the mid-point, i.e.,
\(p_{0.5}\), is the mid-\(p\)-value. The lower endpoint \(p_0\)
is the smallest \(p\)-value attainable if no conservatism attributable to
the discreteness of the null distribution is present. The length of the
\(p\)-value interval is the null probability of the observed outcome and
provides a data-dependent measure of conservatism that is completely
independent of the nominal significance level.
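A small worked example may help relate the three quantities. The sketch below computes \(p_0\), the mid-\(p\)-value and \(p_1\) for a one-sided (upper-tail) test from a discrete null distribution given as a probability mass function. The binomial null distribution, the observed value and the function name upper_tail_pvalue_interval are assumptions chosen for illustration.

```python
from math import comb


def upper_tail_pvalue_interval(null_pmf, observed):
    """Return (p0, mid-p, p1) for an upper-tail test on a discrete null."""
    p_obs = null_pmf.get(observed, 0.0)  # null probability of the observed outcome
    p1 = sum(prob for value, prob in null_pmf.items() if value >= observed)
    p0 = p1 - p_obs          # excludes the observed outcome itself
    mid = p1 - 0.5 * p_obs   # mid-point of the interval (p0, p1]
    return p0, mid, p1


# Hypothetical null distribution: Binomial(10, 0.5), observed value 8.
null_pmf = {x: comb(10, x) * 0.5**10 for x in range(11)}
p0, mid_p, p1 = upper_tail_pvalue_interval(null_pmf, observed=8)
interval_length = p1 - p0
```

Here the interval length equals the null probability of observing exactly 8, namely 45/1024, regardless of any nominal significance level.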
For size(), the test size, i.e., the actual significance level, at the
nominal significance level \(\alpha\) is computed using either the rejection
region corresponding to the \(p\)-value (type = "p-value", default) or the
mid-\(p\)-value (type = "mid-p-value"). The test size is, in
contrast to the \(p\)-value interval, a data-independent measure of
conservatism that depends on the nominal significance level. A test size
smaller or larger than the nominal significance level indicates that the test
procedure is conservative or anti-conservative, respectively, at that
particular nominal significance level. However, as pointed out by Berger
(2001), even when the actual and nominal significance levels are identical,
conservatism may still affect the \(p\)-value.
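The notion of test size can be illustrated by summing the null probabilities of all outcomes that would be rejected at a given nominal level. The sketch below does this for an upper-tail test on a discrete null distribution; the rejection rule "reject when the (mid-)\(p\)-value is at most \(\alpha\)", the binomial null distribution and the function name actual_size are assumptions made for this example.

```python
from math import comb


def actual_size(null_pmf, alpha, kind="p-value"):
    """Null probability of the rejection region at nominal level alpha."""
    if kind not in ("p-value", "mid-p-value"):
        raise ValueError("kind must be 'p-value' or 'mid-p-value'")
    size = 0.0
    for value, prob in null_pmf.items():
        upper_tail = sum(q for v, q in null_pmf.items() if v >= value)
        p = upper_tail if kind == "p-value" else upper_tail - 0.5 * prob
        if p <= alpha:  # assumed rejection rule
            size += prob
    return size


# Hypothetical null distribution: Binomial(10, 0.5), nominal level 0.05.
null_pmf = {x: comb(10, x) * 0.5**10 for x in range(11)}
size_p = actual_size(null_pmf, alpha=0.05, kind="p-value")
size_mid_p = actual_size(null_pmf, alpha=0.05, kind="mid-p-value")
```

In this example the \(p\)-value-based rejection region is {9, 10} with size 11/1024, which is conservative, whereas the mid-\(p\)-value-based region is {8, 9, 10} with size 56/1024, which slightly exceeds the nominal level of 0.05.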