pqr2Ps: Joint Probability of a Clade Surviving Infinitely or Being Sampled Once
Description
Given the rates of branching, extinction and sampling, calculates the joint
probability of a random clade (of unknown size, from 1 to infinite) either
(a) never going extinct on an infinite time-scale or (b) being sampled at
least once, if it does ever go extinct. As we often assume perfect or close
to perfect sampling at the modern (and can thus assume that all living
groups are sampled), we refer to this value as the Probability of Being Sampled,
or simply P(s). This quantity is useful for calculating the
probability distributions of waiting times that depend on a clade being
sampled (or not).
Usage
pqr2Ps(p, q, r, useExact = TRUE)
Arguments
p
Instantaneous rate of speciation (lambda). If the underlying model assumed is
anagenetic (e.g. taxonomic change within a single lineage, 'phyletic evolution')
with no branching of lineages, then p will be used as the rate of anagenetic differentiation.
q
Instantaneous rate of extinction (mu)
r
Instantaneous rate of sampling
useExact
If TRUE, an exact solution developed by Emily King is
used; if FALSE, an iterative, inexact solution is used, which is somewhat slower
(in addition to being inexact...).
Value
Returns a single numerical value, representing the joint probability of a clade
generated under these rates either never going extinct or being sampled before
it goes extinct.
Details
Note that the use of the word 'clade' here can mean a monophyletic group
of any size, including a single 'species' (i.e. a single phylogenetic branch)
that goes extinct before producing any descendants. Many scientists I have
met reserve the word 'clade' for only groups that contain at least one
branching event, and thus contain two 'species'. I personally prefer to
use the generic term 'lineage' to refer to monophyletic groups of one to
infinity members, but others reserve this term for a set of morphospecies
that reflect an unbroken anagenetic chain.
Obviously the equation used makes assumptions about prior knowledge of the
time-scales associated with clades being extant or not: if we're talking
about clades that originated a short time before the recent, the clades that
will go extinct on an infinite time-scale probably haven't had enough time
to actually go extinct. On reasonably long time-scales, however, this
infinite-time assumption should be a reasonable approximation, as clades that
survive 'forever' in a homogeneous birth-death scenario are those that get very
large immediately
(similarly, most clades that go extinct also go extinct very shortly after
originating... yes, life is tough).
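To get a feel for how quickly the infinite-time approximation becomes reasonable, the sketch below uses the standard result for a homogeneous birth-death process started from a single lineage: the probability that the clade is extinct by time t. The helper name `extinctBy` is hypothetical and not part of this package, and sampling (r) is ignored here; only branching and extinction are considered.

```r
# Hypothetical helper (not part of the package): probability that a clade
# started from one lineage is extinct by time t, under constant rates of
# branching (p) and extinction (q). Sampling is ignored in this sketch.
extinctBy <- function(t, p, q) {
  if (p == q) {
    # critical case, where the general expression would divide by zero
    return(p * t / (1 + p * t))
  }
  g <- exp(-(p - q) * t)
  q * (1 - g) / (p - q * g)
}

# Extinction probability at increasing durations; for p > q the values
# approach the infinite-time limit q/p.
extinctBy(c(1, 10, 100, 1000), p = 0.2, q = 0.1)
```

Comparing the values at short and long durations shows how much of the eventual extinction probability has already been realized after a given amount of time, under these assumed rates.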
Both an exact and an inexact (iterative) solution are offered; the exact solution
was derived in an entirely different fashion but seems to faithfully reproduce
the results of the inexact solution and is much faster. Thus, the exact
solution is the default. As it would be very simple for any user to look this up
in the code anyway, here's the unpublished equation for the exact solution:
$P_s = 1 - \frac{(p+q+r) - \sqrt{(p+q+r)^2 - 4pq}}{2p}$
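As a minimal sketch, the expression above can be written as a plain R function and cross-checked numerically. Both helper names below (`exactPs`, `iterPs`) are hypothetical and are not the package's own code. In particular, `iterPs` is not the iterative solution selected by `useExact = FALSE`; it is a simple fixed-point iteration that assumes the first event on a lineage is extinction, sampling or branching, with probabilities proportional to q, r and p, so that the probability E of going extinct without ever being sampled satisfies E = q/(p+q+r) + (p/(p+q+r)) * E^2.

```r
# Hypothetical helper: the exact expression for P(s) given above.
exactPs <- function(p, q, r) {
  s <- p + q + r
  1 - (s - sqrt(s^2 - 4 * p * q)) / (2 * p)
}

# Hypothetical cross-check (an assumption of this sketch, not the package's
# iterative method): iterate E = q/s + (p/s) * E^2 starting from E = 0,
# where E is the probability of going extinct without being sampled.
iterPs <- function(p, q, r, tol = 1e-12, maxIter = 1e5) {
  s <- p + q + r
  E <- 0
  for (i in seq_len(maxIter)) {
    Enew <- q / s + (p / s) * E^2
    if (abs(Enew - E) < tol) break
    E <- Enew
  }
  1 - Enew
}

# The two values should agree to within the iteration tolerance.
exactPs(p = 0.1, q = 0.1, r = 0.1)
iterPs(p = 0.1, q = 0.1, r = 0.1)

# With no sampling (r = 0) and p > q, the expression reduces to 1 - q/p,
# the probability of never going extinct.
exactPs(p = 0.2, q = 0.1, r = 0)
```

Solving the quadratic implied by the fixed-point equation for its smaller root gives the same expression as the exact solution, which is why the two helpers are expected to agree under these assumptions.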
References
Bapst, D. W., E. A. King and M. W. Pennell. In prep. Probability models
for branch lengths of paleontological phylogenies.
Bapst, D. W. 2013. A stochastic rate-calibrated method for time-scaling
phylogenies of fossil taxa. Methods in Ecology and Evolution.
4(8):724-733.