simFossilTaxa_SRCond(r, avgtaxa, p, q, anag.rate = 0, prop.bifurc = 0,
prop.cryptic = 0, nruns = 1, mintime = 1, maxtime = 1000,
minExtant = 0, maxExtant = NULL, count.cryptic = FALSE,
print.runs = FALSE, plot = FALSE)
simFossilTaxa
so please the help file for that function in addition.
simFossilTaxa_SRCond is a wrapper for simFossilTaxa for when clades of a
particular size are desired, post-sampling. simFossilTaxa simulates a
birth-death process (Kendall, 1948; Nee, 2006), but unlike most functions
for this implemented in R, this function enmeshes the simulation of
speciation and extinction with explicit models of how lineages are
morphologically differentiated, as morphotaxa are the basic units of
paleontological estimates of diversity and phylogenetics. For more details
on the workings of simFossilTaxa and many of the arguments involved, please
see the help file for simFossilTaxa (simFossilTaxa
).
simFossilTaxa_SRCond first calculates the expected proportion of taxa
sampled, given the sampling rate and the rates which control lineage
termination: extinction, anagenesis and bifurcation. The average original
clade size needed to produce, on average, a given number of sampled taxa
(the argument 'avgtaxa') is calculated with the following equation:
$N = avgtaxa / (r / (r + (q + anag.rate + (prop.bifurc * p)) ))$
We will call that quantity N. Note that the quantity (prop.bifurc * p)
describes the rate of bifurcation when there is no cryptic cladogenesis, as
prop.bifurc is the ratio of budding to bifurcating cladogenesis. This
equation will diverge in ways that are not easily predicted as the rate of
cryptic speciation increases. Note, as of version 1.5, this equation was
altered to the form above. The previous form was similar and at values of
avgtaxa greater than 10 or so, produces almost identical values. The above
is preferred for its relationship to taxonomic completeness (Solow and
Smith, 1997).
Next, this value is used with simFossilTaxa, where mintaxa is set to N and
maxtaxa set to 2*N. simFossilTaxa_SRcond will generally produce simulated
datasets that are generally of that size or larger post-sampling (although
there can be some variance). Some combinations of parameters may take an
extremely long time to find large enough datasets. Some combinations may
produce very strange datasets that may have weird structure that is only a
result of the conditioning (for example, the only clades that have many taxa
when net diversification is low or negative will have lots of very early
divergences, which could impact analyses). Needless to say, conditioning can
be very difficult.simFossilTaxa
, sampleRanges
,
simPaleoTrees
, taxa2phylo
,
taxa2cladogram
set.seed(444)
avgtaxa <- 50
r <- 0.5
#using the SRcond version
taxa <- simFossilTaxa_SRCond(r=r,p=0.1,q=0.1,nruns=20,avgtaxa=avgtaxa)
#now let's use sampleRanges and count number of sampled taxa
ranges <- lapply(taxa,sampleRanges,r=r)
ntaxa <- sapply(ranges,function(x) sum(!is.na(x[,1])))
hist(ntaxa)
mean(ntaxa)
#works okay... some parameter combinations are difficult to get right number of taxa
layout(1)
Run the code above in your browser using DataLab