A general function to perform Procrustes analysis of two- or three-dimensional landmark data that can include both fixed landmarks and sliding semilandmarks
gpagen(
A,
curves = NULL,
surfaces = NULL,
PrinAxes = TRUE,
max.iter = NULL,
tol = 1e-04,
ProcD = FALSE,
approxBE = FALSE,
sen = 0.5,
Proj = TRUE,
verbose = FALSE,
print.progress = TRUE,
Parallel = FALSE
)
Either an object of class geomorphShapes or a 3D array (p x k x n) containing landmark coordinates for a set of specimens. If A is a geomorphShapes object, the curves argument is not needed.
An optional matrix defining which landmarks should be treated as semilandmarks on boundary
curves, and which landmarks specify the tangent directions for their sliding. This matrix is generated
with readland.shapes
following digitizing of curves in StereoMorph, or may be generated
using the function define.sliders
.
An optional vector defining which landmarks should be treated as semilandmarks on surfaces
A logical value indicating whether or not to align the shape data by principal axes
The maximum number of GPA iterations to perform before superimposition is halted. The final number of iterations will be larger than max.iter, if curves or surface semilandmarks are involved, as max.iter will pertain only to iterations with sliding of landmarks.
A numeric value that can be adjusted for evaluation of the convergence criterion. If the criterion is lower than the tolerance, iterations will be stopped.
A logical value indicating whether or not Procrustes distance should be used as the criterion for optimizing the positions of semilandmarks (if not, bending energy is used)
A logical value for whether bending energy should be estimated via approximate thin-plate spline (TPS) mapping for sliding semilandmarks. Approximate TPS mapping is much faster and allows for more iterations, which might return more reliable results than few iterations with full TPS. If using full TPS, one should probably change max.iter to be few for large data sets.
A numeric value between 0.1 and 1 to adjust the sensitivity of true bending energy to use in an approximated thin plate mapping of bending energy. This value is the proportion of landmarks (excluding semilandmarks) to seek for estimating bending energy. Sliding semilandmarks are always included in the final set of landmarks used to make estimates, so the actual sensitivity is higher than the chosen value.
A logical value indicating whether or not the Procrustes aligned specimens should be projected into tangent space
A logical value for whether to include statistics that take a long time to calculate with large data sets, including a Procrustes distance matrix among specimens, a variance-covariance matrix among Procrustes coordinates (shape variables), and variances of landmark points. Formatting a data frame for coordinates and centroid size is also embedded within this option. This argument should be FALSE unless explicitly needed. For large data sets, it will slow down the analysis, extensively.
A logical value to indicate whether a progress bar should be printed to the screen.
For sliding semi-landmarks only, either a logical value to indicate whether
parallel processing should be used or a numeric value to indicate the number of cores to use in
parallel processing via the parallel
library.
If TRUE, this argument invokes forking of all processor cores, except one. If
FALSE, only one core is used. A numeric value directs the number of cores to use,
but one core will always be spared. Parallel processing is probably only valuable with
large data sets.
An object of class gpagen returns a list with the following components:
A (p x k x n) array of Procrustes shape variables, where p is the number of landmark points, k is the number of landmark dimensions (2 or 3), and n is the number of specimens. The third dimension of this array contains names for each specimen if specified in the original input array.
A vector of centroid sizes for each specimen, containing the names for each specimen if specified in the original input array.
The number of GPA iterations until convergence was found (or GPA halted).
Variance-covariance matrix among Procrustes shape variables.
Variances of landmark points.
The consensus (mean) configuration.
Procrustes distance matrix for all specimens (see details). Note that for large data sets, R might return a memory allocation error, in which case the error will be suppressed and this component will be NULL. For such cases, users can augment memory allocation and create distances with the dist function, independent from gpagen, using the coords or data output.
Number of landmarks.
Number of landmark dimensions.
Number of semilandmarks along curves.
Number of semilandmarks as surface points.
Data frame with an n x (pk) matrix of Procrustes shape variables and centroid size.
Final convergence criterion value.
Method used to slide semilandmarks.
The match call.
The function performs a Generalized Procrustes Analysis (GPA) on two-dimensional or three-dimensional
landmark coordinates. The analysis can be performed on fixed landmark points, semilandmarks on
curves, semilandmarks on surfaces, or any combination. If data are provided in the form of a 3D array, all
landmarks and semilandmarks are contained in this object. If this is the only component provided, the function
will treat all points as if they were fixed landmarks. To designate some points as semilandmarks, one uses
the "curves=" or "surfaces=" options (or both). To include semilandmarks on curves, a matrix defining
which landmarks are to be treated as semilandmarks is provided using the "curves=" option. This matrix contains
three columns that specify the semilandmarks and two neighboring landmarks which are used to specify the tangent
direction for sliding. The matrix may be generated using the function define.sliders
). Likewise,
to include semilandmarks
on surfaces, one must specify a vector listing which landmarks are to be treated as surface semilandmarks
using the "surfaces=" option. The "ProcD=FALSE" option (the default) will slide the semilandmarks
based on minimizing bending energy, while "ProcD=TRUE" will slide the semilandmarks along their tangent
directions using the Procrustes distance criterion. The Procrustes aligned specimens may be projected into tangent
space using the "Proj=TRUE" option.
The function also outputs a matrix of pairwise Procrustes Distances, which correspond to Euclidean distances between specimens in tangent space if "Proj=TRUE", or to the geodesic distances in shape space if "Proj=FALSE".
NOTE: Large datasets may exceed the memory limitations of R.
Generalized Procrustes Analysis (GPA: Gower 1975, Rohlf and Slice 1990) is the primary means by which shape variables are obtained from landmark data (for a general overview of geometric morphometrics see Bookstein 1991, Rohlf and Marcus 1993, Adams et al. 2004, Zelditch et al. 2012, Mitteroecker and Gunz 2009, Adams et al. 2013). GPA translates all specimens to the origin, scales them to unit-centroid size, and optimally rotates them (using a least-squares criterion) until the coordinates of corresponding points align as closely as possible. The resulting aligned Procrustes coordinates represent the shape of each specimen, and are found in a curved space related to Kendall's shape space (Kendall 1984). Typically, these are projected into a linear tangent space yielding Kendall's tangent space coordinates (i.e., Procrustes shape variables), which are used for subsequent multivariate analyses (Dryden and Mardia 1993, Rohlf 1999). Additionally, any semilandmarks on curves and surfaces are slid along their tangent directions or tangent planes during the superimposition (see Bookstein 1997; Gunz et al. 2005). Presently, two implementations are possible: 1) the locations of semilandmarks can be optimized by minimizing the bending energy between the reference and target specimen (Bookstein 1997), or by minimizing the Procrustes distance between the two (Rohlf 2010). Note that specimens are NOT automatically reflected to improve the GPA-alignment.
The generic functions, print
, summary
, and plot
all work with gpagen
.
The generic function, plot
, calls plotAllSpecimens
.
Starting with geomorph version 4.0, it is possible to use approximated thin-plate spline (TPS) mapping to estimate bending energy (when bending energy is used for sliding semi-landmarks). Approximated TPS has some computational benefit for large data sets (more GPA iterations are possible in a shorter time), but one should make sure first that results are reasonable. Approximated TPS uses a subset of points (semilandmarks) to estimate bending energy locally, where points are free to slide. Some subsets might be misrepresentative of the global (full configuration) bending energy, and strange results are possible. A comparison between TPS and approximated TPS outcomes could be tried with a few specimens to verify consistency of results before using approximated TPS on a large data set.
Choosing to use approximated TPS with only a few sliders is dangerous, as the subset of points might be too small to be reliable. Approximated TPS will work best for data sets with a sufficient number of surface landmarks. If a data set is nearly 100
Compared to older versions of geomorph, users might notice subtle differences in Procrustes shape variables when using semilandmarks (curves or surfaces). This difference is a result of using recursive updates of the consensus configuration with the sliding algorithms (minimized bending energy or Procrustes distances). (Previous versions used a single consensus through the sliding algorithms.) Shape differences using the recursive updates of the consensus configuration should be highly correlated with shape differences using a single consensus during the sliding algorithm, but rotational "flutter" can be expected. This should have no qualitative effect on inferential analyses using Procrustes residuals.
Adams, D. C., F. J. Rohlf, and D. E. Slice. 2004. Geometric morphometrics: ten years of progress following the 'revolution'. It. J. Zool. 71:5-16.
Adams, D. C., F. J. Rohlf, and D. E. Slice. 2013. A field comes of age: Geometric morphometrics in the 21st century. Hystrix.24:7-14.
Bookstein, F. L. 1991. Morphometric tools for landmark data: Geometry and Biology. Cambridge Univ. Press, New York.
Bookstein, F. L. 1997. Landmark methods for forms without landmarks: morphometrics of group differences in outline shape. 1:225-243.
Dryden, I. L., and K. V. Mardia. 1993. Multivariate shape analysis. Sankhya 55:460-480.
Gower, J. C. 1975. Generalized Procrustes analysis. Psychometrika 40:33-51.
Gunz, P., P. Mitteroecker, and F. L. Bookstein. 2005. semilandmarks in three dimensions. Pp. 73-98 in D. E. Slice, ed. Modern morphometrics in physical anthropology. Klewer Academic/Plenum, New York.
Kendall, D. G. 1984. Shape-manifolds, Procrustean metrics and complex projective spaces. Bulletin of the London Mathematical Society 16:81-121.
Mitteroecker, P., and P. Gunz. 2009. Advances in geometric morphometrics. Evol. Biol. 36:235-247.
Rohlf, F. J., and D. E. Slice. 1990. Extensions of the Procrustes method for the optimal superimposition of landmarks. Syst. Zool. 39:40-59.
Rohlf, F. J., and L. F. Marcus. 1993. A revolution in morphometrics. Trends Ecol. Evol. 8:129-132.
Rohlf, F. J. 1999. Shape statistics: Procrustes superimpositions and tangent spaces. Journal of Classification 16:197-223.
Rohlf, F. J. 2010. tpsRelw: Relative warps analysis. Version 1.49. Department of Ecology and Evolution, State University of New York at Stony Brook, Stony Brook, NY.
Zelditch, M. L., D. L. Swiderski, H. D. Sheets, and W. L. Fink. 2012. Geometric morphometrics for biologists: a primer. 2nd edition. Elsevier/Academic Press, Amsterdam.
# NOT RUN {
# Example 1: fixed points only
data(plethodon)
Y.gpa <- gpagen(plethodon$land, PrinAxes = FALSE)
summary(Y.gpa)
plot(Y.gpa)
# Example 2: points and semilandmarks on curves
data(hummingbirds)
###Slider matrix
hummingbirds$curvepts
# Using bending energy for sliding
Y.gpa <- gpagen(hummingbirds$land, curves = hummingbirds$curvepts,
ProcD = FALSE)
summary(Y.gpa)
plot(Y.gpa)
# Using Procrustes Distance for sliding
Y.gpa <- gpagen(hummingbirds$land, curves = hummingbirds$curvepts,
ProcD = TRUE)
summary(Y.gpa)
plot(Y.gpa)
# Example 3: points, curves and surfaces
data(scallops)
# Using Procrustes Distance for sliding
Y.gpa <- gpagen(A = scallops$coorddata, curves = scallops$curvslide,
surfaces = scallops$surfslide)
# NOTE can summarize as: summary(Y.gpa)
# NOTE can plot as: plot(Y.gpa)
# }
Run the code above in your browser using DataLab