Performs stochastic character mapping (Huelsenbeck et al., 2003) using several different alternative methods.
make.simmap(tree, x, model="SYM", nsim=1, ...)
simmap(object, ...)
A object of class "simmap"
or "multiSimmap"
which consists of an object of class "phylo"
(or a list of such objects with class "multiPhylo"
), with the following additional elements:
a list of named vectors containing the times spent in each state on each branch, in the order in which they occur.
a matrix containing the total time spent in each state along each edge of the tree.
the assumed or sampled value of Q
.
the log-likelihood of the assumed or sampled Q
.
a phylogenetic tree as an object of class "phylo"
, or a list of trees as an object of class "multiPhylo"
.
a vector containing the tip states for a discretely valued character, or a matrix containing the prior probabilities of tip states in rows and character states as column names. The names (if x
is a vector) or row names (if x
is a matrix) should match the tip labels of the tree. The vector can be of class "factor"
, "character"
, or "numeric"
(although in the lattermost case its content should obviously be only integer values).
a character string containing the model or a transition model specified in the form of a matrix. See ace
for more details.
number of simulations. If tree
is an object of class "multiPhylo"
, then nsim
simulations will be conducted per input tree.
optional arguments. So far, pi
gives the prior distribution on the root node of the tree. Acceptable values for pi
are "equal"
, "estimated"
, or a vector with the frequencies. If pi="estimated"
then the stationary distribution is estimated by numerically solving pi*Q=0
for pi
, and this is used as a prior on the root. If pi="fitzjohn"
, then the Fitzjohn et al. (2009) root prior is used. Finally, if pi
is a numeric vector then the root state will be sampled from this vector. The function defaults to pi="equal"
which results in the root node being sampled from the conditional scaled likelihood distribution at the root. message
tells whether or not to print a message containing the rate matrix, Q and state frequencies. message
defaults to TRUE
. For optional argument Q="mcmc"
(see below) the mean value of Q
from the posterior sample is printed. tol
gives the tolerance for zero elements in Q
. (Elements less then tol
will be reset to tol
). Optional argument Q
can be a string ("empirical"
or "mcmc"
), or a fixed value of the transition matrix, Q. If "empirical"
than a single value of Q, the most likely value, is used for all simulations. If "mcmc"
, then nsim
values of Q are first obtained from the posterior distribution for Q using Bayesian MCMC, then a simulated stochastic character map is generated for each sampled value of Q. Optional argument vQ
can consist of a single numeric value or a vector containing the variances of the (normal) proposal distributions for the MCMC. The order of vQ
is assumed to be in the order of the index.matrix
in ace
for the chosen model. prior
is a list containing alpha
and beta
parameters for the \(\Gamma\) prior distribution on the transition rates in Q. Note that alpha
and beta
can be single values or vectors, if different priors are desired for each value in the transition matrix Q. As for vQ
, the order of prior
is assumed to correspond with the order of index.matrix
as in ace
. prior
can also be given the optional logical value use.empirical
which tells the function whether or not to give the prior distribution the empirical mean for Q. If TRUE
then only prior$beta
is used and prior$alpha
is set equal to prior$beta
times the empirical mean of Q. burnin
and samplefreq
are burn-in and sample frequency for the MCMC, respectively.
for generic simmap
method, object of various classes: for instance, an object of class "fitMk"
from fitMk
.
Liam Revell liam.revell@umb.edu
For Q="empirical"
, make.simmap
first fits a continuous-time reversible Markov model for the evolution of x
and then simulates stochastic character histories using that model and the tip states on the tree. This is the same procedure that is described in Bollback (2006), except that simulation is performed using a fixed value of the transition matrix, Q, instead of by sampling Q from its posterior distribution.
For Q="mcmc"
, make.simmap
first samples Q nsim
times from the posterior probability distribution of Q using MCMC, then it simulates nsim
stochastic maps conditioned on each sampled value of Q.
For Q
set to a matrix, make.simmap
samples stochastic mappings conditioned on the fixed input matrix.
make.simmap
uses code that has been adapted from ape's function ace
(by Paradis et al.) to perform Felsenstein's pruning algorithm to compute the likelihood.
As of phytools >= 0.2-33 x
can be a vector of states or a matrix containing the prior probabilities of tip states in rows. In this case the column names of x
should contain the states, and the row names should contain the tip names.
Note that there was a small (but potentially significant) bug in how node states were simulated by make.simmap
in versions of phytools <= 0.2-26. Between phytools 0.2-26 and 0.2-36 there was also a bug for asymmetric models of character change (e.g., model="ARD"
). Finally, between phytools 0.2-33 and phytools 0.2-47 there was an error in use of the conditional likelihoods for the root node, which caused the root node of the tree to be sampled incorrectly. Giorgio Bianchini pointed out that in phytools 1.0-1 (and probably prior recent versions) there was an error sampling the state at the root node of the tree based on the input prior (pi
) supplied by a user -- except for pi="equal"
(a flat prior, the default) or for a prior distribution in which one or another state was known to be the global root state (e.g., pi=c(1,0)
, pi=c(0,1)
, etc.). All of these issues should be fixed in the current and all later versions.
If tree
is an object of class "multiPhylo"
then nsim
stochastic maps are generated for each input tree.
Bollback, J. P. (2006) Stochastic character mapping of discrete traits on phylogenies. BMC Bioinformatics, 7, 88.
FitzJohn, R. G., W. P. Maddison, and S. P. Otto (2009) Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Systematic Biology, 58, 595-611.
Huelsenbeck, J. P., R. Neilsen, and J. P. Bollback (2003) Stochastic mapping of morphological characters. Systematic Biology, 52, 131-138.
Paradis, E., J. Claude, and K. Strimmer (2004) APE: Analyses of phylogenetics and evolution in R language. Bioinformatics, 20, 289-290.
Revell, L. J. (2024) phytools 2.0: an updated R ecosystem for phylogenetic comparative methods (and other things). PeerJ, 12, e16505.
Revell, L. J. and L. J. Harmon (2022) Phylogenetic Comparative Methods in R. Princeton University Press.
ancr
, brownie.lite
, brownieREML
, countSimmap
, describe.simmap
, evol.vcv
, plotSimmap
, read.simmap
, write.simmap
if (FALSE) {
## load tree and data from Revell & Collar (2009)
data(sunfish.tree)
data(sunfish.data)
## extract discrete character (feeding mode)
fmode<-setNames(sunfish.data$feeding.mode,
rownames(sunfish.data))
## fit model
er_model<-fitMk(sunfish.tree,fmode,model="ER",
pi="fitzjohn")
## do stochastic mapping
sunfish_smap<-simmap(er_model)
## print a summary of the stochastic mapping
summary(sunfish_smap)
## plot a posterior probabilities of ancestral states
cols<-setNames(c("blue","red"),levels(fmode))
plot(summary(sunfish_smap),colors=cols,ftype="i")
legend("topleft",c("non-piscivorous","piscivorous"),
pch=21,pt.bg=cols,pt.cex=2)
par(mar=c(5.1,4.1,4.1,2.1),las=1)
## plot posterior density on the number of changes
plot(density(sunfish_smap),bty="l")
title(main="Posterior distribution of changes of each type",
font.main=3)
## reset par to default
par(las=0)}
Run the code above in your browser using DataLab