fitDiscrete: fitting macroevolutionary models to phylogenetic trees
Usage

fitDiscrete(phy, dat,
    model = c("ER", "SYM", "ARD", "meristic"),
    transform = c("none", "EB", "lambda", "kappa", "delta", "white"),
    bounds = list(),
    control = list(method = c("subplex", "L-BFGS-B"),
        niter = 100, FAIL = 1e+200, hessian = FALSE, CI = 0.95),
    ncores = NULL, ...)

# S3 method for gfit
as.Qmatrix(x, ...)
Arguments

phy: a phylogenetic tree of class phylo
dat: a data vector for a single trait, with names matching tips in phy
model: an Mkn model to fit to comparative data (see Details)
transform: an evolutionary model used to transform the tree (see Details)
bounds: a range to constrain parameter estimates (see Details)
control: settings used for optimization of the model likelihood
ncores: number of cores; if NULL, the number of cores is detected
x: an object of class "gfit", for the S3 method as.Qmatrix
...: if model="meristic", ... can dictate whether the matrix is asymmetric (symmetric=FALSE)
Value

fitDiscrete returns a list with the following four elements:
lik is the function used to compute the model likelihood. The returned function (lik) takes arguments that are necessary for the given model. For instance, if estimating an untransformed ER model, there would be a single argument (the transition rate) necessary for the lik function. The tree and data are stored internally within the lik function, which permits those elements to be efficiently reused when computing the likelihood under different parameter values. By default, the function evaluates the likelihood of the model by weighting root states in accordance with their conditional probability given the data (this is the "obs" option; see FitzJohn et al. 2009). This default behavior can be changed in the call to lik with, for instance, lik(pars, root="flat"), which would weight each state equally at the root. The other useful option is "given", where the user must also supply a vector (root.p) of probabilities for each possible state. To make likelihoods roughly comparable between geiger and ape, one should use the option lik(pars, root="given", root.p=rep(1,k)), where k is the number of character states. See Examples for a demonstration.
bnd is a matrix of the bounds used for the relevant parameters estimated in the model. Warnings will be issued if any parameter estimates occur at the supplied (or default) parameter bounds.
res is a matrix of results from optimization. Rownames of the res matrix are the optimization methods (see optim and subplex). The columns in the res matrix are the estimated parameter values, the estimated model likelihood, and an indication of optimization convergence. Values of convergence not equal to zero are not to be trusted.
opt is a list of the primary results: estimates of the parameters, the maximum-likelihood estimate (lnL) of the model, the optimization method used to compute the MLE, the number of model parameters (k, including one parameter for the root state), the AIC (aic), and the sample-size corrected AIC (aicc). The number of observations for AIC computation is taken to be the number of trait values observed. If the Hessian is used, confidence intervals on the parameter estimates (CI) and the Hessian matrix (hessian) are also returned.
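As a brief sketch of how these pieces might be used (fit.er and fit.ard below are hypothetical objects returned by fitDiscrete, and the element names follow the opt description above), candidate models can be compared through their information scores:

# pull summary values from the opt element of a fitted model (sketch)
fit.er$opt$lnL                                    # maximized log-likelihood
fit.er$opt$k                                      # number of parameters
c(ER = fit.er$opt$aicc, ARD = fit.ard$opt$aicc)   # smaller AICc is preferred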
Details

This function fits various likelihood models for discrete character evolution. The function returns parameter estimates and the likelihood for univariate datasets. All of the models are continuous-time Markov models of trait evolution (see Yang 2006 for a good general discussion of this type of model).

The model likelihood is maximized using methods available in optim as well as subplex. Optimization methods to be used within optim can be specified through the control object.
A number of random starting points are used in optimization and are given through the niter element within the control object (e.g., control$niter). Finding the maximum-likelihood fit is sometimes tricky, especially as the number of parameters in the model increases. Even in the example below, a slightly suboptimal fit is occasionally returned with the default settings when fitting the general (ARD) model. There is no rule of thumb for the number of iterations that will be appropriate for a given dataset and model, but one can use the variance in fitted likelihoods across iterations as an indication of the difficulty of the likelihood space (see details of the res object in Value). Twenty optimization iterations per parameter seems to be a decent starting point for fitting these models.
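As a sketch of that guideline (phy and dat below are placeholders for a user's tree and data), an ARD model for a three-state character has six transition rates, suggesting on the order of 120 random starting points:

# raise the number of random restarts through the control object
fit.ard <- fitDiscrete(phy, dat, model = "ARD",
                       control = list(niter = 120), ncores = 1)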
The FAIL value within the control object should be a large value that will be considerably far from the -lnL of the maximum model likelihood. In most cases, the default setting for control$FAIL will be appropriate. The Hessian may be used to compute confidence intervals (CI) for the parameter estimates if the hessian element in control is TRUE.
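For instance, the following sketch (again with placeholder phy and dat) requests the Hessian so that the confidence intervals described under Value are computed:

# request the Hessian and 95% confidence intervals at fit time
fit <- fitDiscrete(phy, dat, model = "ER",
                   control = list(hessian = TRUE, CI = 0.95), ncores = 1)
fit$opt$CI        # approximate confidence intervals on the rate estimate(s)
fit$opt$hessian   # the Hessian matrix itself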
The function can handle traits with any number of character states, under a range of models. The character model is specified by the model argument:
ER is an equal-rates model where a single parameter governs all transition rates.
SYM is a symmetric model where forward and reverse transitions share the same parameter.
ARD is an all-rates-different model where each rate is a unique parameter.
meristic is a model wherein transitions occur in a stepwise fashion (e.g., 1 to 2 to 3 to 2) without skipping intermediate steps; this requires a sensible coding of the character states, as consecutive integers are assumed to be neighboring states.
matrix is a user-supplied model (given as a dummy matrix representing transition classes between states); elements that are zero signify rates that are also zero (see Examples and the sketch following this list).
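To illustrate the matrix option, the sketch below writes out a stepwise design for a hypothetical four-state ordered character, with zeros forbidding transitions that skip intermediate states and separate rate classes for gains and losses:

# dummy matrix of transition classes for an ordered 4-state character:
# only neighboring states exchange; class 1 = forward step, class 2 = backward step
step4 <- rbind(c(NA, 1, 0, 0),
               c(2, NA, 1, 0),
               c(0, 2, NA, 1),
               c(0, 0, 2, NA))
# fitDiscrete(phy, dat4, model = step4)  # dat4: hypothetical data coded as states 1-4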
The transform argument allows one to test models where rates vary across the tree. Bounds for the relevant parameters of the tree transform may be given through the bounds argument. Several bounds can be given at a time. Default bounds under the different models are given below (see also the sketch following the list of transform options).
Options for transform are as follows:
none is a model of rate constancy through time
EB is the Early-burst model (Harmon et al. 2010), also called the ACDC model (accelerating-decelerating; Blomberg et al. 2003). Set by the a rate parameter, EB fits a model where the rate of evolution increases or decreases exponentially through time, under the model r[t] = r[0] * exp(a * t), where r[0] is the initial rate, a is the rate change parameter, and t is time. Default bounds are a = c(min = -10, max = 10).
lambda is one of the Pagel (1999) models that fits the extent to which the phylogeny predicts covariance among trait values for species. The model effectively transforms the tree: values of lambda near 0 cause the phylogeny to become more star-like, and a lambda value of 1 recovers the none model. Default bounds are lambda = c(min = 0, max = 1).
kappa is a punctuational (speciational) model of trait evolution (Pagel 1999), where character divergence is related to the number of speciation events between two species. Note that if there are speciation events missing from the given phylogeny (due to extinction or incomplete sampling), interpretation under the kappa model may be difficult. Considered as a tree transformation, the model raises all branch lengths to an estimated power (kappa). Default bounds are kappa = c(min = 0, max = 1).
delta is a time-dependent model of trait evolution (Pagel 1999). The delta model is similar to ACDC insofar as the delta model fits the relative contributions of early versus late evolution in the tree to the covariance of species trait values. Where delta is greater than 1, recent evolution has been relatively fast; if delta is less than 1, recent evolution has been comparatively slow. Interpreted as a tree transformation, the model raises all node depths to an estimated power (delta). Default bounds are delta = c(min = 0, max = 3).
white is a white-noise (non-phylogenetic) model, which converts the tree into a star phylogeny.
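A transform is combined with a character model in a single call; the sketch below (with placeholder phy and dat) fits an equal-rates model on a lambda-transformed tree and then constrains the EB rate-change parameter through bounds:

# ER model with Pagel's lambda transform (default bounds: min = 0, max = 1)
fit.lambda <- fitDiscrete(phy, dat, model = "ER", transform = "lambda")
# ER model with an early-burst transform, allowing only decelerating rates
fit.eb <- fitDiscrete(phy, dat, model = "ER", transform = "EB",
                      bounds = list(a = c(min = -5, max = 0)))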
References

Yang Z. 2006. Computational Molecular Evolution. Oxford University Press: Oxford.

FitzJohn RG, WP Maddison, and SP Otto. 2009. Estimating trait-dependent speciation and extinction rates from incompletely resolved molecular phylogenies. Systematic Biology 58:595-611.
Examples

# NOT RUN {
library(geiger)  # provides fitDiscrete, treedata, and the geospiza example data
## match data and tree
tmp=get(data(geospiza))
td=treedata(tmp$phy, tmp$dat)
geo=list(phy=td$phy, dat=td$data)
gb=round(geo$dat[,5]) ## create discrete data
names(gb)=rownames(geo$dat)
tmp=fitDiscrete(geo$phy, gb, model="ER", control=list(niter=5), ncores=2) #-7.119792
## using the returned likelihood function
lik=tmp$lik
lik(0.3336772, root="obs") #-7.119792
lik(0.3336772, root="flat") #-8.125354
lik(0.3336772, root="given", root.p=rep(1/3,3)) #-8.125354
lik(0.3336772, root="given", root.p=c(0, 1, 0)) #-7.074039
lik(c(0.3640363), root="given", root.p=rep(1,3)) #-7.020569 & comparable to the ape::ace solution
# }
# NOT RUN {
# general model (ARD)
fitDiscrete(geo$phy, gb, model="ARD", ncores=1) #-6.064573
# user-specified rate classes
mm=rbind(c(NA, 0, 0), c(1, NA, 2), c(0, 2, NA))
fitDiscrete(geo$phy, gb, model=mm, ncores=1) #-7.037944
# symmetric-rates model
fitDiscrete(geo$phy, gb, model="SYM", ncores=1)#-6.822943
# }
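# NOT RUN {
## sketch: extract the fitted transition matrix with the as.Qmatrix method
## shown in Usage ('tmp' is the ER fit created in the examples above)
Q=as.Qmatrix(tmp)
Q
# }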