mixed.mdmr allows users to conduct multivariate distance matrix
regression (MDMR) in the context of a (hierarchically) clustered sample
without inflating Type-I error rates as a result of the violation of the
independence assumption. This is done by invoking a mixed-effects modeling
framework, in which clustering/grouping variables are specified as random
effects and the covariate effects of interest are fixed effects. The input
to mixed.mdmr largely mirrors the input of the lmer function from the
package lme4 insofar as the specification of random and fixed effects is
concerned (see Arguments for details). Note that
this function simply controls for the random effects in order to test the
fixed effects; it does not facilitate point estimation or inference on the
random effects.
mixed.mdmr(fmla, data, D = NULL, G = NULL, use.ssd = 1,
start.acc = 1e-20, ncores = 1)
An object with six elements and a summary function. Calling
summary(mixed.mdmr.res) produces a data frame comprising:
Value of the corresponding MDMR test statistic
Numerator degrees of freedom for the corresponding effect
The p-value for each effect.
In addition to the information in the three columns comprising
summary(res), the res object also contains:
A data.frame reporting the precision of each p-value. If analytic
p-values were computed, this is the maximum error bound of each p-value
reported by the davies function in CompQuadForm. If permutation p-values
were computed, it is the standard error of each permutation p-value.
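As a rough illustration of the permutation case, the standard error of a Monte Carlo p-value is commonly taken to be the binomial standard error sqrt(p * (1 - p) / nperm). The sketch below assumes that form; the helper name perm_p_se and the argument nperm are hypothetical and are not part of the package's interface.

```r
# Hypothetical helper: binomial standard error of a permutation p-value.
# Assumes the p-value is a proportion estimated from `nperm` permutations.
perm_p_se <- function(p, nperm) {
  sqrt(p * (1 - p) / nperm)
}

# Example: a p-value of 0.02 estimated from 500 permutations
se <- perm_p_se(p = 0.02, nperm = 500)
se
```

More permutations shrink this standard error at a rate of 1 / sqrt(nperm).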
Note that the printed output of summary(res) will truncate p-values
to the smallest trustworthy values, but the object returned by
summary(res) will contain the p-values as computed. The reason for
this truncation differs for analytic and permutation p-values. For an
analytic p-value, if the error bound of the Davies algorithm is larger than
the p-value, the only conclusion that can be drawn with certainty is that
the p-value is smaller than (or equal to) the error bound.
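The analytic case can be sketched in a few lines of base R. The values below are made up for illustration, and the variable names (p_value, err_bound, reportable) are hypothetical; this is not the package's printing code, only the logic the note describes.

```r
# Made-up analytic p-values and Davies error bounds for two effects
p_value   <- c(x1 = 3e-12, x2 = 0.04)
err_bound <- c(x1 = 1e-10, x2 = 1e-10)

# A p-value below its error bound can only be reported as "<= bound"
is_bound   <- p_value < err_bound
reportable <- pmax(p_value, err_bound)

data.frame(p_value, err_bound, reportable, is_bound)
```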
A one-sided linear formula object describing both the fixed-effects and random-effects part of the model, beginning with an ~ operator, which is followed by the terms to include in the model, separated by + operators. Random-effects terms are distinguished by vertical bars (|) separating expressions for design matrices from grouping factors. Two vertical bars (||) can be used to specify multiple uncorrelated random effects for the same grouping variable.
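The formula conventions described above can be tried out in base R without fitting anything, because a formula is stored unevaluated. The sketch below uses the variable names x1, x2, and grp from the example data; the object names f_intercept, f_slopes, and f_uncorr are hypothetical.

```r
# Random intercept for each level of grp
f_intercept <- ~ x1 + x2 + (1 | grp)

# Correlated random intercept and slopes for x1 and x2 within grp
f_slopes <- ~ x1 + x2 + (x1 + x2 | grp)

# Same random effects, specified as uncorrelated with the double bar
f_uncorr <- ~ x1 + x2 + (x1 + x2 || grp)

# A one-sided formula has length 2 (the ~ operator and the right-hand side)
one_sided <- length(f_slopes) == 2L

# Variables that must be present in the data frame
vars <- all.vars(f_slopes)
vars
```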
A mandatory data frame containing the variables named in fmla.
Distance matrix computed on the outcome data. Can be either a
matrix or an R dist object. Either D or G must be passed to
mixed.mdmr().
Gower's centered similarity matrix computed from D. Either D or G
must be passed to mixed.mdmr().
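For readers who want to see what G looks like, the sketch below builds Gower's centered matrix from a distance matrix using the standard double-centering formula G = H A H, with A = -0.5 * D^2 (elementwise) and H the centering matrix. It uses simulated data and base R only; it is an illustration of the definition, not the package's own code.

```r
set.seed(1)
Y <- matrix(rnorm(20), nrow = 5)   # simulated outcome data, 5 cases
D <- dist(Y)                       # Euclidean distances
n <- attr(D, "Size")

A <- -0.5 * as.matrix(D)^2            # elementwise squared distances
H <- diag(n) - matrix(1 / n, n, n)    # centering matrix
G <- H %*% A %*% H                    # Gower's centered similarity matrix

# The trace of G equals the sum of squared pairwise distances divided by n
c(trace_G = sum(diag(G)), ssd_over_n = sum(as.vector(D)^2) / n)
```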
The proportion of the total sum of squared distances (SSD)
that will be targeted in the modeling process. In the case of non-Euclidean
distances, specifying use.ssd to be slightly smaller than 1.00 (e.g.,
0.99) can substantially lower the computational burden of mixed.mdmr
while maintaining well-controlled Type-I error rates and only sacrificing
a trivial amount of power. In the case of Euclidean distances the
computational burden of mixed.mdmr is small, so use.ssd should
be set to 1.00.
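How mixed.mdmr applies use.ssd internally is not spelled out here. One plausible reading, sketched below under that assumption, is that the eigenvalues of G partition the total SSD, so a target proportion translates into a number of leading dimensions to retain. The names lambda, share, and n_keep are hypothetical, and the data are simulated.

```r
set.seed(1)
Y <- matrix(rnorm(40), nrow = 10)          # simulated outcomes, 10 cases
D <- as.matrix(dist(Y))
n <- nrow(D)
H <- diag(n) - matrix(1 / n, n, n)
G <- H %*% (-0.5 * D^2) %*% H              # Gower's centered matrix

# Eigenvalues of G; with Euclidean distances they are non-negative
lambda <- eigen(G, symmetric = TRUE, only.values = TRUE)$values
lambda <- lambda[lambda > 1e-10]           # drop numerically zero values

# Cumulative share of the total SSD captured by the leading dimensions
share   <- cumsum(lambda) / sum(lambda)
use.ssd <- 0.99
n_keep  <- which(share >= use.ssd)[1]      # assumed retention rule
n_keep
```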
Starting accuracy of the Davies (1980) algorithm implemented in the
davies function in the CompQuadForm package (Duchesne & De Micheaux, 2010)
that mixed.mdmr() uses to compute MDMR p-values.
Integer; if ncores > 1, the parallel package is used to speed
computation. Note: Windows users must set ncores = 1 because the parallel
package relies on forking. See mc.cores in the mclapply function in the
parallel package for more details.
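The platform restriction can be handled when choosing ncores. The sketch below picks a core count that is safe on Windows and passes it to mclapply as mc.cores; the toy computation and the choice of 2 cores elsewhere are arbitrary assumptions for illustration.

```r
library(parallel)

# Forking is unavailable on Windows, so fall back to a single core there
is_windows <- .Platform$OS.type == "windows"
ncores <- if (is_windows) 1L else 2L

# Toy workload: mclapply runs serially when mc.cores = 1
squares <- mclapply(1:4, function(i) i^2, mc.cores = ncores)
unlist(squares)
```

The same ncores value could then be passed to mixed.mdmr through its ncores argument.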
Daniel B. McArtor (dmcartor@gmail.com) [aut, cre]
Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.
Duchesne, P., & De Micheaux, P. L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.
McArtor, D. B. (2017). Extending a distance-based approach to multivariate multiple regression (Doctoral Dissertation).
library(MDMR)
data("clustmdmrdata")
# Get distance matrix
D <- dist(Y.clust)
# Regular MDMR without the grouping variable
mdmr.res <- mdmr(X = X.clust[,1:2], D = D, perm.p = FALSE)
# Results look significant
summary(mdmr.res)
# Account for grouping variable
mixed.res <- mixed.mdmr(~ x1 + x2 + (x1 + x2 | grp),
                        data = X.clust, D = D)
# Significance was due to the grouping variable
summary(mixed.res)