This function calculates the residual sum of squares via either ordinary least squares (OLS) estimation or
phylogenetic generalized least squares (PGLS) estimation for both full and reduced models. Residuals from the reduced model are used
in a randomized residual permutation procedure (RRPP) to find the difference in residual sum of squares (trace of the residual
sums of squares and cross-products matrix, SSCP) over many permutations, thus creating a distribution of sum of squares (SS)
for the parameters that differ between models (Collyer et al. 2015). The SS can be converted to F-values to generate an empirical F-distribution.
A P-value is estimated as the percentile of the observed value in this distribution.
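The permutation procedure described above can be sketched in base R. This is a minimal OLS-only illustration on simulated data, not the code used by the function; the object names (X.red, X.full, rss, SS) and the number of iterations are assumptions made for the example.

```r
# Minimal RRPP sketch (OLS only); all names and data here are illustrative.
set.seed(1)
n <- 30
x <- rnorm(n)                                   # covariate in both models
g <- factor(rep(c("a", "b", "c"), each = 10))   # factor only in the full model
Y <- matrix(rnorm(n * 4), n, 4)                 # stand-in for aligned shape data

X.red  <- model.matrix(~ x)       # reduced model design matrix
X.full <- model.matrix(~ x + g)   # full model design matrix

# Residual SS = trace of the residual SSCP matrix
rss <- function(X, Y) sum(lm.fit(X, Y)$residuals^2)

fit.red    <- lm.fit(X.red, Y)
fitted.red <- fit.red$fitted.values
resid.red  <- fit.red$residuals

iter <- 199
SS <- numeric(iter + 1)
for (i in seq_len(iter + 1)) {
  # The first iteration keeps the observed order, so SS[1] is the observed SS
  ord <- if (i == 1) seq_len(n) else sample(n)
  # Pseudo-data: reduced-model fitted values plus permuted reduced-model residuals
  Y.star <- fitted.red + resid.red[ord, , drop = FALSE]
  # SS for the terms that differ between the models
  SS[i] <- rss(X.red, Y.star) - rss(X.full, Y.star)
}

# Proportion of outcomes (observed included) at least as large as the observed SS
p.value <- mean(SS >= SS[1])
```

Counting the observed value as one of the random outcomes keeps the estimated P-value above zero.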
The response matrix 'Y' can be in the form of a two-dimensional data
matrix of dimension (n x [p x k]) or a 3D array (p x k x n). It is assumed that the landmarks have previously
been aligned using Generalized Procrustes Analysis (GPA) [e.g., with gpagen]. The names specified for the
independent (x) variables in the formula represent one or more
vectors containing continuous data or factors. It is assumed that the order of the specimens in the
shape matrix matches the order of values in the independent variables. Linear model fits (using the lm function)
can also be input in place of a formula. Arguments for lm can also be passed on via this function.
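To make the two accepted layouts of the response concrete, the following base R sketch converts a (p x k x n) array to an (n x [p x k]) matrix and back. The coordinate order within a row (x1, y1, x2, y2, ...) is an assumption of this example, and the simulated coordinates are not real landmark data; geomorph's two.d.array and arrayspecs functions are meant for this kind of conversion in practice.

```r
# Illustrative conversion between the two response layouts (base R only).
set.seed(1)
p <- 5; k <- 2; n <- 4                            # landmarks, dimensions, specimens
A <- array(rnorm(p * k * n), dim = c(p, k, n))    # p x k x n array

# n x (p * k) matrix: each row holds one specimen as x1, y1, x2, y2, ...
Y <- t(apply(A, 3, function(spec) as.vector(t(spec))))

# Back to a p x k x n array
A2 <- array(apply(Y, 1, function(row) matrix(row, p, k, byrow = TRUE)),
            dim = c(p, k, n))
```

Whichever layout is used, the specimen order (rows of Y, or the third dimension of A) has to match the order of the independent variables.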
The SS calculated is the same as the sum of squared Procrustes distances among specimens, which is the measure of SS used in Procrustes ANOVA (see Goodall 1991).
Procrustes ANOVA, often used in morphometrics applications, is equivalent
to distance-based ANOVA designs (Anderson 2001). Unlike procD.lm, this function is strictly for comparison
of two nested models. (Use of procD.lm will be more suitable in most cases.)
Effect sizes (Z-scores) are computed as standard deviates of either the statistic chosen for ANOVA (see arguments) or the
pairwise statistics, based on the sampling distributions generated. Z-scores might be more intuitive companions to P-values than F-values are (see Collyer et al. 2015).
For ANOVA Z-scores, a log-transformation is performed first, so that the sampling distribution is approximately normal.
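As a rough illustration of the effect-size calculation, the sketch below computes a Z-score as a standard deviate of log-transformed F-values. The random F-values are simulated stand-ins for an RRPP sampling distribution, the observed value of 6.5 is invented, and the exact standardization used by the function may differ in detail.

```r
# Illustrative Z-score from a sampling distribution of F-values (base R only).
set.seed(2)
# First element plays the role of the observed statistic; the rest are random outcomes
F.rand <- c(6.5, rf(999, df1 = 2, df2 = 27))

logF <- log(F.rand)                         # log-transform toward normality
Z <- (logF[1] - mean(logF)) / sd(logF)      # standard deviate of the observed value
```

Centering on the mean of the random outcomes, rather than on a theoretical expected value, follows the approach described in the notes for version 3.0.4.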
Pairwise tests have two flavors: 1) tests for differences in group means (based on vector length between
means for pairwise comparisons) and 2) tests for angular differences in slopes between groups. These tests are
similar in concept to trajectory analysis (Adams and Collyer 2007; Collyer and Adams 2007; Adams and Collyer 2009;
Collyer and Adams 2013), in that pairwise statistics are either vector lengths or angular differences between vectors.
These tests are different from trajectory analysis (see trajectory.analysis), however, because a factorial model
is not explicitly needed to contrast vectors between point factor levels nested within group factor levels. For angular differences
between factor-covariate slopes, either the angle or the vector correlation can be tested. It should be understood
that a vector correlation of 1 (parallel vectors), not 0, is the null hypothesis, meaning slopes are the same.
Pairwise tests are only performed if formulae are provided to compute such results.
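The two kinds of pairwise statistic can be written out in a few lines of base R. The vectors below are made-up values standing in for group means and group slopes; the sketch shows only how the statistics are defined, not how their sampling distributions are generated.

```r
# Illustrative pairwise statistics (base R only; all vectors are hypothetical).

# 1) Distance between two group means: length of the difference vector
mean.a <- c(0.10, 0.20, 0.05)
mean.b <- c(0.12, 0.15, 0.09)
d.means <- sqrt(sum((mean.a - mean.b)^2))

# 2) Angular difference between two slope vectors
slope.a <- c(1, 0, 0)
slope.b <- c(1, 1, 0)
r <- sum(slope.a * slope.b) / sqrt(sum(slope.a^2) * sum(slope.b^2))  # vector correlation
angle.deg <- acos(r) * 180 / pi                                      # about 45 degrees

# Parallel slopes give a vector correlation of 1, which is the null expectation
r.parallel <- sum(slope.a * (2 * slope.a)) /
  sqrt(sum(slope.a^2) * sum((2 * slope.a)^2))
```

Because the null hypothesis corresponds to r = 1 (an angle of 0), smaller correlations or larger angles are the outcomes that count as evidence of different slopes.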
The generic functions print, summary, and plot all work with advanced.procD.lm.
The generic function plot produces diagnostic plots for residuals of the linear fit. Note that there is an
argument in the print/summary generic functions to print formulas as row names of the ANOVA table. If
formulas are long, it is recommended to set this argument to formula = FALSE, in which case
the models are labeled "reduced" and "full" instead.
Notes for geomorph 3.0.7 and subsequent versions
The advanced.procD.lm function now defers to the R package RRPP, specifically the anova.lm.rrpp and pairwise
functions. These functions perform all computations needed for advanced.procD.lm, as well as other
analyses. Therefore, advanced.procD.lm is now a wrapper for these other functions.
The lm.rrpp function can be used for multiple models, if one wishes to work directly in RRPP, prior to using the
anova.lm.rrpp and pairwise functions. The only difference in results (compared to version 3.0.6 and before)
should occur when comparing univariate slopes. Version 3.0.6 and earlier versions appended a vector of 1s to slopes as an ad hoc strategy to make
computations work. This is no longer needed, as the RRPP functions can better handle univariate data.
Notes for geomorph 3.0.6 and subsequent versions
For pairwise tests, previous versions assumed that pairwise comparisons of least-squares means used models with parallel slopes.
Under most circumstances, this assumption is safe (and preferred), as the estimation of mean differences otherwise would have to
assume something about the mean values of covariates as appropriate locations for estimating means. Version 3.0.6 and subsequent versions
find least-squares means that are truer to the model defined. For example, if a user defines a full model with parallel slopes, e.g.,
shape ~ x + A + B + A:B, where x is a covariate and A and B are factors, results should be no different than before. However, if a user
defines a full model which allows unique slopes, e.g., shape ~ x + A + B + x:A + x:B + A:B + x:A:B, least squares means will now be estimated
for mean values of x using the coefficients for x:A, x:B, and x:A:B (previous versions did not). This change was made to
be consistent with least-squares means estimation functions in other packages.
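The difference between the two kinds of model can be seen in a small univariate example with a single factor. This is a base R sketch on simulated data using lm and predict, not the function's own code; the data and object names are assumptions made for illustration.

```r
# Illustrative least-squares means at the mean of a covariate (base R only).
set.seed(3)
n <- 40
A <- factor(rep(c("a", "b"), each = 20))
x <- rnorm(n)
y <- 1 + 0.5 * x + ifelse(A == "b", 1 + x, 0) + rnorm(n, sd = 0.1)
dat <- data.frame(y, x, A)

fit.par <- lm(y ~ x + A, data = dat)   # parallel slopes
fit.uni <- lm(y ~ x * A, data = dat)   # unique slope per group (adds x:A)

# Evaluate each group at the overall mean of the covariate
new.data <- data.frame(x = mean(dat$x),
                       A = factor(levels(dat$A), levels = levels(dat$A)))
lsm.par <- predict(fit.par, new.data)  # uses only the x and A coefficients
lsm.uni <- predict(fit.uni, new.data)  # also uses the x:A coefficient
```

With unique slopes, the estimated mean for group b includes the x:A coefficient multiplied by the mean of x, which is the kind of term that versions before 3.0.6 left out.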
Notes for geomorph 3.0.4 and subsequent versions
Compared to previous versions of geomorph, users might notice differences in effect sizes. Previous versions used z-scores calculated with
expected values of statistics from null hypotheses (sensu Collyer et al. 2015); however, Adams and Collyer (2016) showed that expected values
for some statistics can vary with sample size and variable number, and recommended finding the expected value empirically, as the mean from the set
of random outcomes. Geomorph 3.0.4 and subsequent versions now center z-scores on their empirically estimated expected values and, where appropriate,
log-transform values so that statistics are approximately normally distributed. This can result in negative effect sizes when statistics are smaller than
expected compared to the average random outcome. For ANOVA-based functions, the option to choose among different statistics to measure effect size
is now a function argument.
An optional argument for including a phylogenetic tree of class phylo is included in this function. ANOVA performed on separate PGLS models is analogous
to a likelihood ratio test between models (Adams and Collyer 2018). Pairwise tests can also be performed after PGLS estimation of coefficients but users
should be aware that no formal research on the statistical properties (type I error rates and statistical power) of pairwise statistics with PGLS has yet
been performed. Using PGLS together with analysis of pairwise statistics therefore carries some risk.