This function performs Generalized Linear Mixed Models for binary
and continuous phylogenetic data, estimating regression
coefficients with approximate standard errors. It is modeled after
lmer
but is more general by allowing
correlation structure within random effects; these correlations can
be phylogenetic among species, or any other correlation structure,
such as geographical correlations among sites. It is, however, much
more specific than lmer
in that it can
only analyze a subset of1 the types of model designed handled by
lmer
. It is also much slower than
lmer
and requires users to specify
correlation structures as covariance
matrices. communityPGLMM
can analyze models in Ives and
Helmus (2011). It can also analyze bipartite phylogenetic data,
such as that analyzed in Rafferty and Ives (2011), by giving sites
phylogenetic correlations.
communityPGLMM(
formula,
data = list(),
family = "gaussian",
sp = NULL,
site = NULL,
random.effects = list(),
REML = TRUE,
s2.init = NULL,
B.init = NULL,
reltol = 10^-6,
maxit = 500,
tol.pql = 10^-6,
maxit.pql = 200,
verbose = FALSE
)communityPGLMM.gaussian(
formula,
data = list(),
family = "gaussian",
sp = NULL,
site = NULL,
random.effects = list(),
REML = TRUE,
s2.init = NULL,
B.init = NULL,
reltol = 10^-8,
maxit = 500,
verbose = FALSE
)
communityPGLMM.binary(
formula,
data = list(),
family = "binomial",
sp = NULL,
site = NULL,
random.effects = list(),
REML = TRUE,
s2.init = 0.25,
B.init = NULL,
reltol = 10^-5,
maxit = 40,
tol.pql = 10^-6,
maxit.pql = 200,
verbose = FALSE
)
communityPGLMM.binary.LRT(x, re.number = 0, ...)
communityPGLMM.matrix.structure(
formula,
data = list(),
family = "binomial",
sp = NULL,
site = NULL,
random.effects = list(),
ss = 1
)
# S3 method for communityPGLMM
summary(object, digits = max(3, getOption("digits") - 3), ...)
# S3 method for communityPGLMM
plot(x, digits = max(3, getOption("digits") - 3), ...)
communityPGLMM.predicted.values(x, show.plot = TRUE, ...)
an object of class communityPGLMM
the formula for fixed effects
the dataset
either gaussian
or binomial
depending on the model fit
the list of random effects
estimates of the regression coefficients
approximate standard errors of the fixed effects regression coefficients
approximate covariance matrix for the fixed effects regression coefficients
approximate Z scores for the fixed effects regression coefficients
approximate tests for the fixed effects regression coefficients being different from zero
random effects' standard deviations for the covariance matrix \(\sigma^2V\) for each random effect in order. For the linear mixed model, the residual variance is listed last
random effects variances for non-nested random effects
random effects variances for nested random effects
for linear mixed models, the residual vairance
for linear mixed models, the log-likelihood for either the restricted likelihood (REML=TRUE
) or the overall likelihood (REML=FALSE
). This is set to NULL for generalised linear mixed models
for linear mixed models, the AIC for either the restricted likelihood (REML=TRUE
) or the overall likelihood (REML=FALSE
). This is set to NULL for generalised linear mixed models
for linear mixed models, the BIC for either the restricted likelihood (REML=TRUE
) or the overall likelihood (REML=FALSE
). This is set to NULL for generalised linear mixed models
whether or not REML is used (TRUE
or FALSE
)
the user-provided initial estimates of s2
the user-provided initial estimates of B
the response (dependent) variable returned in matrix form
the predictor (independent) variables returned in matrix form (including 1s in the first column)
the residuals. For the generalized linear mixed model, these are the predicted residuals in the \(logit^{-1}\) space
the inverse of the covariance matrix for the entire system (of dimension (nsp*nsite) by (nsp*nsite))
predicted mean values for the generalized linear mixed model. Set to NULL for linear mixed models
matrices used to construct the nested design matrix
the design matrix for random effects
diagonal matrix that maps the random effects variances onto the design matrix
the convergence code provided by optim
number of iterations performed by optim
a two-sided linear formula object describing the
fixed-effects of the model; for example, Y ~ X
.
a data.frame
containing the variables
named in formula. The data frame should have long format with
factors specifying species and sites. communityPGLMM
will
reorder rows of the data frame so that species are nested within
sites. Please note that calling
as.data.frame.comparative.comm
will return your
comparative.comm
object into this format for you.
either gaussian
for a Linear Mixed Model, or
binomial
for binary dependent data.
a factor
variable that identifies species
a factor
variable that identifies sites
a list
that contains, for
non-nested random effects, lists of triplets of the form
list(X, group = group, covar = V)
. This is modeled after the
lmer
formula syntax (X | group)
where X
is a variable and group
is a grouping
factor. Note that group
should be either your sp
or
site
variable specified in sp
and site
. The
additional term V
is a covariance matrix of rank equal to
the number of levels of group that specifies the covariances among
groups in the random effect X
. For nested variable random
effects, random.effects
contains lists of quadruplets of the
form list(X, group1 = group1, covar = V, group2 = group2)
where group1
is nested within group2
.
whether REML or ML is used for model fitting. For the generalized linear mixed model for binary data, these don't have standard interpretations, and there is no log likelihood function that can be used in likelihood ratio tests.
an array of initial estimates of s2 for each random
effect that scales the variance. If s2.init is not provided for
family="gaussian"
, these are estimated using in a clunky way
using lm
assuming no phylogenetic signal. A better
approach is to run link[lme4:lmer]{lmer}
and use the output
random effects for s2.init
. If s2.init
is not
provided for family="binomial"
, these are set to 0.25.
initial estimates of \(B\), a matrix containing
regression coefficients in the model for the fixed effects. This
matrix must have dim(B.init)=c(p+1,1)
, where p
is the
number of predictor (independent) variables; the first element of
B
corresponds to the intercept, and the remaining elements
correspond in order to the predictor (independent) variables in the
formula. If B.init
is not provided, these are estimated
using in a clunky way using lm
or glm
assuming no phylogenetic signal. A better approach is to run
lmer
and use the output fixed effects for
B.init
.
a control parameter dictating the relative tolerance
for convergence in the optimization; see optim
.
a control parameter dictating the maximum number of
iterations in the optimization; see optim
.
a control parameter dictating the tolerance for convergence in the PQL estimates of the mean components of the binomial GLMM.
a control parameter dictating the maximum number of iterations in the PQL estimates of the mean components of the binomial GLMM.
if TRUE
, the model deviance and running
estimates of s2
and B
are plotted each iteration
during optimization.
communityPGLMM
object
which random.effect
in x
to be
tested
additional arguments to summary and plotting functions (currently ignored)
which of the random.effects
to produce
communityPGLMM object to be summarised
minimal number of significant digits for printing, as
in print.default
if TRUE
(default), display plot
Anthony R. Ives, cosmetic changes by Will Pearse
The vignette 'pez-pglmm-overview' gives a gentle
introduction to using PGLMMS. For linear mixed models (family
= 'gaussian'
), the function estimates parameters for the model of
the form, for example,
$$Y = \beta_0 + \beta_1x + b_0 + b_1x$$ $$b_0 ~ Gaussian(0, \sigma_0^2I_{sp})$$ $$b_1 ~ Gaussian(0, \sigma_0^2V_{sp})$$ $$\eta ~ Gaussian(0,\sigma^2)$$
where \(\beta_0\) and \(\beta_1\) are fixed effects, and \(V_{sp}\) is a variance-covariance matrix derived from a phylogeny (typically under the assumption of Brownian motion evolution). Here, the variation in the mean (intercept) for each species is given by the random effect \(b_0\) that is assumed to be independent among species. Variation in species' responses to predictor variable \(x\) is given by a random effect \(b_0\) that is assumed to depend on the phylogenetic relatedness among species given by \(V_{sp}\); if species are closely related, their specific responses to \(x\) will be similar. This particular model would be specified as
re.1 <- list(1, sp = dat$sp, covar = diag(nspp))
re.2 <- list(dat$X, sp = dat$sp, covar = Vsp)
z <- communityPGLMM(Y ~ X, data = data, family = "gaussian", random.effects = list(re.1, re.2))
The covariance matrix covar is standardized to have its determinant equal to 1. This in effect standardizes the interpretation of the scalar \(\sigma^2\). Although mathematically this is not required, it is a very good idea to standardize the predictor (independent) variables to have mean 0 and variance 1. This will make the function more robust and improve the interpretation of the regression coefficients. For categorical (factor) predictor variables, you will need to construct 0-1 dummy variables, and these should not be standardized (for obvious reasons).
For binary generalized linear mixed models (family =
'binomial'
), the function estimates parameters for the model of
the form, for example,
$$y = \beta_0 + \beta_1x + b_0 + b_1x$$ $$Y = logit^{-1}(y)$$ $$b_0 ~ Gaussian(0, \sigma_0^2I_{sp})$$ $$b_1 ~ Gaussian(0, \sigma_0^2V_{sp})$$
where \(\beta_0\) and \(\beta_1\) are fixed effects, and \(V_{sp}\) is a variance-covariance matrix derived from a phylogeny (typically under the assumption of Brownian motion evolution).
z <- communityPGLMM(Y ~ X, data = data, family =
'binomial', random.effects = list(re.1, re.2))
As with the linear mixed model, it is a very good idea to standardize the predictor (independent) variables to have mean 0 and variance 1. This will make the function more robust and improve the interpretation of the regression coefficients. For categorical (factor) predictor variables, you will need to construct 0-1 dummy variables, and these should not be standardized (for obvious reasons).
Ives, A. R. and M. R. Helmus. 2011. Generalized linear mixed models for phylogenetic analyses of community structure. Ecological Monographs 81:511-525.
Rafferty, N. E., and A. R. Ives. 2013. Phylogenetic trait-based analyses of ecological networks. Ecology 94:2321-2333.