Learn R Programming

phytools (version 2.4-4)

fitMk: Fits extended Mk model for discrete character evolution

Description

The functions fitMk, fitmultiMk, fitpolyMk, fitHRM, fitMk.parallel, fitgammaMk, fitfnMk, and mcmcMk fit various flavors of the extended Mk model (Lewis, 2001) for discrete character evolution on a reconstructed phylogeny.

Usage

fitMk(tree, x, model="SYM", fixedQ=NULL, ...)
# S3 method for fitMk
plot(x, ...)
# S3 method for gfit
plot(x, ...)
fitmultiMk(tree, x, model="ER", ...)
fitpolyMk(tree, x, model="SYM", ordered=FALSE, ...)
graph.polyMk(k=2, model="SYM", ordered=FALSE, ...)
# S3 method for fitpolyMk
plot(x, ...)
mcmcMk(tree, x, model="ER", ngen=10000, ...)
# S3 method for mcmcMk
plot(x, ...)
# S3 method for mcmcMk
density(x, ...)
# S3 method for density.mcmcMk
plot(x, ...)
fitHRM(tree, x, model="ARD", ncat=2, ...)
# S3 method for fitHRM
plot(x, ...)
fitMk.parallel(tree, x, model="SYM", ncores=1, ...)
fitgammaMk(tree, x, model="ER", fixedQ=NULL, nrates=8, ...)
fitfnMk(tree, x, model="polynomial", degree=2, ...)

Value

An object of class "fitMk", "fitmultiMk", "fitpolyMk", "mcmcMk", "fitHRM", "fitgammaMk", or "fitfnMk". In the case of density.mcmcMk an object of class "density.mcmcMk".

plot.fitMk, plot.gfit, and plot.HRM invisibly return the coordinates of vertices of the plotted Q-matrix.

Arguments

tree

an object of class "phylo". In the case of fitmultiMk an object of class "simmap" with a mapped discrete character.

x

a vector (or numeric matrix) of tip values for species; names(x) (or rownames(x)) should be the species names. In the case of plot and density methods, an object of the appropriate class.

model

model. See make.simmap or ace for details. For fitfnMk the only option is presently model="polynomial".

fixedQ

fixed value of transition matrix Q, if one is desired.

ordered

for fitpolyMk, a logical value indicating whether or not the character should be treated as ordered. For now the function assumes alphanumerical order (i.e., numbers sorted by their initial and then successive digits followed by characters or character strings in alphabetical order).

k

For graph.polyMk, the number of monomorphic states for the discrete trait.

ngen

number of generations of MCMC for mcmcMk.

ncat

number of rate categories (per level of the discrete trait) in the hidden-rate model.

ncores

number of cores for fitMk.parallel.

nrates

number of rate categories for discretized \(\Gamma\) distribution.

degree

the degree of the polynomial for fitfnMk. (Defaults to degree=2.)

...

optional arguments, including pi, the prior distribution at the root node (defaults to pi="equal"). Other options for pi include pi="fitzjohn" (which implements the prior distribution of FitzJohn et al. 2009), pi="estimated" (which finds the stationary distribution of state frequencies and sets that as the prior), or an arbitrary prior distribution specified by the user. For plot method optional arguments include (but may not be limited to): signif, the number of digits for the rates to be plotted; main, a character vector of length two with the headings for each subplot; cex.main, cex.traits, and cex.rates, font sizes for the various text elements of the plot; and show.zeros, a logical argument specifying whether or not to plot arrows with the ML estimated transition rate is not different from zero (with tolerance specified by the optional argument tol). Finally, for fitpolyMk, both order (an evolutionary sequence for the monomorphic condition) and max.poly can be set for the ordered=TRUE model. If not set, order defaults to alphanumeric order, and max.poly defaults to the highest level of polymorphism observed in the data. (max.poly can also be specified for an unordered polymorphic trait model, and if not specified, defaults to the number of distinct monomorphic states of the data.)

Author

Liam Revell liam.revell@umb.edu

Details

The function fitMk fits a so-called extended Mk model for discrete character evolution (Lewis, 2001).

plot.fitMk plots an object of class "fitMk" returned by fitMk. plot.gfit plots an object of class "gfit" from geiger's fitDiscrete function. Both plots portray the fitted model using a graph of arrows connecting states.

The function fitmultiMk fits an Mk model in which the transition rates between character states are allowed to vary depending on the mapped state of a discrete character on the tree following Revell et al. (2024). It can be combined with, for example, paintSubTree to test hypotheses about how the process of discrete character evolution for x varies between different parts of the tree.

The function fitgammaMk fits an Mk model in which the edge rates are assumed to have been sampled randomly from a \(\Gamma\) distribution with mean of 1.0 and shape parameter \(\alpha\) (Revell and Harmon, In review).

The function fitfnMk fit an ordered Mk model in which the backward and forward transition rates between adjacent levels of the trait vary according to a functional form. Presently that function form is an nth degree polynomial, in which degree is set by the user (but defaults to degree = 2).

The function fitpolyMk fits an Mk model to data for a discrete character with intraspecific polymorphism. Polymorphic species should be coded with the name of the two or more states recorded for the species separated by a plus sign + (e.g., A+B would indicate that both states A and B are found in the corresponding taxon). Invariably it's assumed that transitions between states must occur through a polymorphic condition, whereas transitions cannot occur directly between two incompatible polymorphic conditions. For instance, a transition between A+B and B+C would have to occur through the monomorphic state B. At time of writing, this function permits the models "ER" (equal rates for all permitted transitions), "SYM" (symmetric backward & forward rates for all permitted transitions), "ARD" (all-rates-different for permitted transitions), and a new model called "transient" in which the acquisition of polymorphism (e.g., A -> A+B) is assumed to occur at a different rate than its loss (e.g., A+B -> B). The method plot.fitpolyMk plots the fitted Mk model with intraspecific polymorphism.

The function mcmcMk runs a Bayesian MCMC version of fitMk. The shape of the prior distribution of the transition rates is \(\Gamma\), with \(\alpha\) and \(\beta\) via the argument prior, which takes the form of a list. The default value of \(\alpha\) is 0.1, and \(\beta\) defaults to a value such that \(\alpha/\beta\) is equal to the parsimony score for x divided by the sum of the edge lengths of the tree. The shape of the proposal distribution is normal, with mean zero and a variance that can be controlled by the user via the optional argument prior.var. The argument auto.tune, if TRUE or FALSE, indicates whether or not to 'tune' the proposal variance up or down to target a particular acceptance rate (defaults to 0.5). auto.tune can also be a numeric value between 0 and 1, in which case this value will be the target acceptance ratio. The argument plot indicates whether the progress of the MCMC should be plotted (defaults to TRUE, but runs much faster when set to FALSE).

The method plot.mcmcMk plots a log-likelihood trace and a trace of the rate parameters from the MCMC. (This the same graph that is created by setting plot=TRUE in mcmcMk.) The method density.mcmcMk computes a posterior density on the transition rates in the model from the posterior sample obtained in the MCMC, will import the package coda if it is available, and returns an object of class "density.mcmcMk". Finally, the method plot.density.mcmcMk creates a plot of the posterior density (or a set of plots) for the transition rates between states.

Finally, the function fitHRM fits a hidden-rate Mk model following Beaulieu et al. (2013). For the hidden-rate model we need to specify a number of rate categories for each level of the trait - and this can be a vector of different values for each trait. We can also choose a model ("ER", "SYM", or "ARD"), as well as whether or not to treat the character as a 'threshold' trait (umbral=TRUE, defaults to FALSE). This latter model is basically one that allows absorbing conditions for some hidden states. Since this can be a difficult optimization problem, the optional argument niter sets the number of optimization iterations to be run. niter defaults to niter=10. To fit the same default hidden-rates model as is implemented in corHMM, one should set corHMM_model=TRUE and ordered_hrm=FALSE.

Note that (by default) both fitMk and fitmultiMk recycle code from ace in the ape package for computing the likelihood. (If the optional argument lik.func="pruning" then alternative, slightly faster, phytools code for the pruning algorithm is used.) fitpolyMk, mcmcMk, and fitHRM use fitMk internally to compute the likelihood.

References

Beaulieu, J. M., B. C. O'Meara, and M. J. Donoghue (2013) Identifying hidden rate changes in the evolution of a binary morphological character: The evolution of plant habit in campanulid angiosperms. Systematic Biology, 62, 725-737.

FitzJohn, R. G., W. P. Maddison, and S. P. Otto (2009) Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Systematic Biology, 58, 595-611.

Lewis, P. O. (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology, 50, 913-925.

Revell, L. J. (2024) phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12, e16505.

Revell, L. J. and L. J. Harmon (2022) Phylogenetic Comparative Methods in R. Princeton University Press.

Revell, L. J. and L. J. Harmon. In review. A discrete character evolution model for phylogenetic comparative biology with \(\Gamma\)-distributed rate heterogeneity among branches of the tree.

Revell, L. J., K. P. Schliep, D. L. Mahler, and T. Ingram (2024) Testing for heterogeneous rates of discrete character evolution on phylogenies. Journal of Evolutionary Biology, 37, 1591-1602.

See Also

ace, make.simmap, simmap

Examples

Run this code
## load tree and data from Revell & Collar (2009)
data(sunfish.tree)
data(sunfish.data)

## extract discrete character (feeding mode)
fmode<-setNames(sunfish.data$feeding.mode,
    rownames(sunfish.data))

## fit "ER" model
fit.ER<-fitMk(sunfish.tree,fmode,model="ER")
print(fit.ER)

## fit "ARD" model
fit.ARD<-fitMk(sunfish.tree,fmode,model="ARD")
print(fit.ARD)

## compare the models
AIC(fit.ER,fit.ARD)

## load tree and data from Benitez-Alvarez et al. (2000)
data(flatworm.data)
data(flatworm.tree)

## extract discrete character (habitat)
habitat<-setNames(flatworm.data$Habitat,
    rownames(flatworm.data))

## fit polymorphic models "ER" and "transient"
fitpoly.ER<-fitpolyMk(flatworm.tree,habitat,
    model="ER")
fitpoly.transient<-fitpolyMk(flatworm.tree,habitat,
    model="transient")

## print fitted models
print(fitpoly.ER)
print(fitpoly.transient)

## compare model
AIC(fitpoly.ER,fitpoly.transient)

## plot models
par(mfrow=c(2,1))
plot(fitpoly.ER)
mtext("a) ER polymorphic model",adj=0,line=1)
plot(fitpoly.transient)
mtext("b) Transient polymorphic model",adj=0,
     line=1)

## reset par to default
par(mfrow=c(1,1))

Run the code above in your browser using DataLab