The anova.manyany function returns a table summarising the statistical significance
of a fitted manyany model under the alternative hypothesis (object2) as compared
to a fit under the null hypothesis (object). Typically the null model
is nested in the alternative, although it doesn't need to be (but consider seriously whether what
you are doing makes sense if they are not nested).
This function is quite computationally intensive, and a little fussy - it is an early
version we hope to improve on. Feedback welcome!
This function behaves a lot like anova.manyglm, the most conspicuous differences
being in flexibility and computation time. Since this function is based on manyany,
it offers much greater flexibility in terms of types of models that can be fitted (most
fixed effects models with predict and family arguments could be accommodated).
For information on the different types of data that can be modelled using manyany, see
manyany.
However this flexibility comes at considerable cost in terms of computation time, and the
default nBoot has been set to 99 to reflect this (although rerunning at 999 is
recommended). Other more cosmetic differences from anova.manyglm are that
two and only two models can be supplied as input here; adjusted univariate P-values
are not yet implemented; and the range of test statistics and resampling algorithms is
more limited. All test statistics constructed here are sum-of-likelihood ratio statistics
as in Warton et al (2012), and the resampling method used here is the PIT-trap (short
for 'probability integral transform residual bootstrap', Warton et al 2017).
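The following is a minimal base-R sketch of the two ideas named above, a sum-of-likelihood ratio statistic and a simplified PIT-trap. It is not the code used inside anova.manyany: the simulated data, the Poisson family, the helper sum_lr and all object names are assumptions made for illustration, and the PIT residuals are drawn only once rather than handled as in Warton et al (2017).

```r
# Sketch only: simulated Poisson counts, base R (stats), no mvabund calls.
set.seed(1)
n <- 30; p <- 4; n_boot <- 99
x <- rnorm(n)                                   # hypothetical predictor
Y <- matrix(rpois(n * p, lambda = 2), n, p)     # hypothetical counts, no true x effect

# Sum over response columns of the per-column likelihood ratio statistic.
# For a Poisson GLM the deviance difference equals the LR statistic.
sum_lr <- function(Y, x) {
  lr <- sapply(seq_len(ncol(Y)), function(j) {
    fit0 <- glm(Y[, j] ~ 1, family = poisson)   # null model
    fit1 <- glm(Y[, j] ~ x, family = poisson)   # alternative model
    deviance(fit0) - deviance(fit1)
  })
  sum(lr)
}
stat_obs <- sum_lr(Y, x)

# Fitted means under the null, one column per response variable.
mu0 <- sapply(seq_len(p), function(j) fitted(glm(Y[, j] ~ 1, family = poisson)))

# Randomised PIT residuals for discrete data: uniform on (F(y - 1), F(y)).
u <- matrix(ppois(Y - 1, mu0) + runif(n * p) * dpois(Y, mu0), n, p)
u <- pmin(u, 1 - 1e-12)                         # numerical guard against u == 1

# Resample whole rows of residuals (keeps between-variable correlation),
# then map them back to counts through the null fit.
stat_boot <- replicate(n_boot, {
  rows <- sample(n, replace = TRUE)
  Y_star <- matrix(qpois(u[rows, ], mu0), n, p)
  sum_lr(Y_star, x)
})

p_value <- (1 + sum(stat_boot >= stat_obs)) / (n_boot + 1)
```

Rows are resampled jointly across all columns, which is what lets the resampled statistic reflect correlation between response variables even though the statistic itself is a plain sum.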
To check model assumptions, use plot.manyany.
The block argument allows for block resampling, such that valid inferences can
be made across independent blocks of correlated sets of observations.
For example, if data have multiple rows of records for each site, e.g. multi-species
data with entries for different species on different rows, you can use your site ID
variable as the block argument to resample sites, for valid cross-site inferences despite
within-site species correlation. Well, valid assuming sites are independent. You could
do similarly for a repeated measures design to make inferences robust to temporal autocorrelation.
Note that block needs to be balanced, e.g. equal number of species entries for
each site (i.e. include rows for zero abundances too).
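As a rough illustration of the balance requirement, the base-R sketch below stacks a small wide table into one row per site and species (keeping zeros), checks that every block has the same number of rows, and resamples whole blocks. The data frame, the helper is_balanced and all column names are hypothetical; none of this is mvabund code.

```r
# Sketch only: hypothetical 3 sites x 2 species table, zeros included.
wide <- data.frame(site = c("s1", "s2", "s3"),
                   spA  = c(2, 0, 5),
                   spB  = c(0, 1, 3))

# Long format: one row per site-species combination, zero abundances kept.
long <- data.frame(site    = rep(wide$site, times = 2),
                   species = rep(c("spA", "spB"), each = nrow(wide)),
                   abund   = c(wide$spA, wide$spB))

# A blocking variable is balanced when every block has the same row count.
is_balanced <- function(block) length(unique(as.vector(table(block)))) == 1

balanced_ok <- is_balanced(long$site)

# Dropping a zero-abundance row for one site breaks the balance.
dropped <- long[!(long$site == "s2" & long$abund == 0), ]
dropped_ok <- is_balanced(dropped$site)

# Block resampling: draw sites with replacement, then take all of their rows.
set.seed(1)
sampled_sites <- sample(unique(long$site), replace = TRUE)
resampled <- do.call(rbind, lapply(sampled_sites,
                                   function(s) long[long$site == s, ]))
```

Because whole sites are drawn, each resampled data set keeps the species entries of a site together, and with balanced blocks it has the same number of rows as the original.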
The anova.manyany function is designed specifically for high-dimensional data
(that is, when the number of variables p is not small compared to the number of observations
N). In such instances a correlation matrix is computationally intensive to estimate
and is numerically unstable, so by default the test statistic is calculated assuming
independence of variables. Note however that the resampling scheme used ensures that
the P-values are approximately correct even when the independence assumption is not
satisfied.
Rather than stopping after testing for multivariate effects, it is often of interest
to find out which response variables express significant effects. Univariate statistics
are required to answer this question, and these are reported if requested. Setting p.uni="unadjusted"
returns resampling-based univariate P-values for all effects as well as the multivariate
P-values, if composition=FALSE. There are currently no univariate P-value options
when composition=TRUE (it's not entirely clear how such P-values should be obtained).
If univariate P-values are of interest, rerun the model with composition=FALSE.
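To make the relationship between the two kinds of P-value concrete, here is a small base-R sketch. It assumes you already hold a matrix of resampled per-variable statistics; the values below are stand-ins drawn from rchisq, and obs_lr, boot_lr and the variable names are hypothetical rather than objects returned by anova.manyany.

```r
# Sketch only: stand-in numbers, not output from a fitted model.
set.seed(2)
n_boot <- 99; p <- 3

# Rows are resamples, columns are response variables.
boot_lr <- matrix(rchisq(n_boot * p, df = 1), n_boot, p,
                  dimnames = list(NULL, c("sp1", "sp2", "sp3")))
obs_lr <- c(sp1 = 0.4, sp2 = 9.0, sp3 = 1.2)   # hypothetical observed statistics

# Unadjusted univariate P-value: compare each column with its own observed value.
exceeds <- sweep(boot_lr, 2, obs_lr, ">=")
p_uni <- (1 + colSums(exceeds)) / (n_boot + 1)

# Multivariate P-value: compare the summed statistic across variables.
p_multi <- (1 + sum(rowSums(boot_lr) >= sum(obs_lr))) / (n_boot + 1)
```

The univariate values here are unadjusted, so no correction for testing several response variables has been applied.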
A current limitation of the function is that composition needs to be set to
the same value in each manyany object being compared - it is not currently possible
to compare models with and without a compositional term in them.