Learn R Programming

RRPP (version 2.0.3)

measurement.error: Evaluation of measurement error for two or more multivariate measurements, for common research subjects

Description

Function performs analyses concerned with the repeatability (reliability) of multivariate data (measurements) collected from the same research subjects. Although there is no requirement for repeated measurements on all research subjects, the analysis assumes that multiple observations are made.

Usage

measurement.error(
  data,
  Y,
  subjects,
  replicates,
  groups = NULL,
  iter = 999,
  seed = NULL,
  multivariate = FALSE,
  use.PCs = TRUE,
  tol = 0.001,
  Parallel = FALSE,
  turbo = TRUE,
  print.progress = FALSE,
  verbose = FALSE
)

Value

Objects of class "measurement.error" return the same objects as a lm.rrpp fit, plus a list of the following:

AOV

Analysis of variance to test for systematic error, based on dispersion of values.

mAOV

Multivariate AOV based on product of the inverse of the random component (SSCP) of ME times the systematic component of ME.

SSCP

The sums of squares and cross-products matrices for model effects.

SSCP.ME.product

The products of the inverse of the random ME SSCP and the SSCP matrices for systematic ME,. These are the same matrix products used for eigenanalysis. This is the observed matrix.

SSCP.ME.product.std

A list of the symmetric forms of standardized SSCP.ME.products that yield orthogonal eigenvectors.

Arguments

data

A required data frame, either of class data.frame or class rrpp.data.frame. This function cannot be used without a data frame. All arguments for data and variables are names that must exist in the data frame.

Y

A name for a matrix (n x p) of data for n observations and p variables that can be found in the data frame. For example, Y = "morphData".

subjects

A name for a vector or factor of research subjects, found within the data frame (each subject should occur twice or more). The length of the vector in the data frame must equal the number of observations and will be coerced into a factor. For example, subjects = "ID".

replicates

A name for a vector or factor for replicate measurements for research subjects, found within the data frame. The length of the vector in the data frame must equal the number of observations and will be coerced into a factor. For example, replicates = "Rep".

groups

An optional name for a vector in the data frame, coercible to factor, to be included in the linear model (as an interaction with replicates). This would be of interest if one were concerned with systematic ME occurring perhaps differently among certain strata within the data. For example, systematic ME because of an observer bias might only be observed with females or males, in which case the argument might be: groups = "Sex".

iter

Number of iterations for significance testing

seed

An optional argument for setting the seed for random permutations of the resampling procedure. If left NULL (the default), the exact same P-values will be found for repeated runs of the analysis (with the same number of iterations). If seed = "random", a random seed will be used, and P-values will vary. One can also specify an integer for specific seed values, which might be of interest for advanced users.

multivariate

Logical value for whether to include multivariate analyses. Intraclass correlation matrices and relative eigenanalysis are based on products of sums of squares and cross-products (SSCP) matrices, some of which must be inverted and potentially require significant computation time. If FALSE, only statistics based on dispersion of values are calculated.

use.PCs

A logical argument for whether to use the principal components of the data. This might be helpful for relative eigenanalysis, and if p > n, in which case inverting singular covariance matrices would not be possible.

tol

A value indicating the magnitude below which components should be omitted, if use.PCs is TRUE. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) See ordinate for more details.

Parallel

The same argument as in lm.rrpp to govern parallel processing ( either a logical value -- TRUE or FALSE -- or the number of threaded cores to use). See lm.rrpp for additional details.

turbo

Logical value for whether to suppress coefficient estimation in RRPP iteration, thus turbo-charging RRPP.

print.progress

A logical value to indicate whether a progress bar should be printed to the screen.

verbose

A logical value to indicate if all the output from an lm.rrpp analysis should be retained. If FALSE, only the needed output for summaries and plotting is retained.

Author

Michael Collyer

Details

This function performs analyses as described in Collyer and Adams (2024) to assess systematic and random components of measurement error (ME). It basically performs ANOVA with RRPP, but with different restricted randomization strategies. The reliability of research subject variation can be considered by restricting randomization within replicates; the consistency of replicate measures can be considered by restricting randomization within subjects. Inter-subject variation remains constant across all random permutations within subjects and inter-replicate variation remains constant across all random permutations within replicates. Type II sums of squares and cross-products (SSCP) are calculated to assure conditional estimation.

The results include univariate-like statistics based on dispersion of values and eigenanalysis performed on a signal to noise matrix product of SSCP matrices (sensu Bookstein and Mitteroecker, 2014) including the inverse of the random component of ME and the systematic component of ME. The multivariate test is a form of multivariate ANOVA (MANOVA), using RRPP to generate sampling distributions of the major eigenvalue (Roy's maximum root). Likelihood-ratio tests can also be performed using lr_test.

Intraclass correlation coefficients (ICC) can also be calculated (using ICCstats), both based on dispersion of values and covariance matrices, as descriptive statistics. Details are provided in ICCstats.

References

Collyer, M.L. and D.C. Adams. 2024. Interrogating Random and Systematic Measurement Error in Morphometric Data. Evolutionary Biology, 51, 179–20.

Bookstein, F.L., & Mitteroecker, P. (2014). Comparing covariance matrices by relative eigenanalysis, with applications to organismal biology. Evolutionary Biology, 41(2), 336-350.

See Also

lm.rrpp.ws, manova.update, lr_test

Examples

Run this code
if (FALSE) {
# Measurement error analysis on simulated data of fish shapes

data(fishy)

# Example two digitization replicates of the same research subjects
rep1 <- matrix(fishy$coords[1,], 11, 2, byrow = TRUE)
rep2 <- matrix(fishy$coords[61,], 11, 2, byrow = TRUE)
plot(rep1, pch = 16, col = gray(0.5, alpha = 0.5), cex = 2, asp = 1)
points(rep2, pch = 16, col = gray(0.2, alpha = 0.5), cex = 2, asp = 1)

# Analysis unconcerned with groups 

ME1 <- measurement.error(
  Y = "coords",
  subjects = "subj",
  replicates = "reps",
  data = fishy)

anova(ME1)
ICCstats(ME1, subjects = "Subjects", with_in = "Systematic ME")
plot(ME1)

# Analysis concerned with groups 

ME2 <- measurement.error(
  Y = "coords",
  subjects = "subj",
  replicates = "reps",
  groups = "groups",
  data = fishy)

anova(ME2)
ICCstats(ME2, subjects = "Subjects", 
  with_in = "Systematic ME", groups = "groups")
P <- plot(ME2)
focusMEonSubjects(P, subjects = 18:20, shadow = TRUE)
}

Run the code above in your browser using DataLab