paleotree (version 2.5)

timePaleoPhy: Timescaling of Paleo-Phylogenies

Description

Time-scales an unscaled cladogram of fossil taxa using information on their temporal ranges, under one of several methods. It can also resolve polytomies randomly and output samples of randomly resolved trees. Because simple methods of time-scaling phylogenies of fossil taxa can bias macroevolutionary analyses (Bapst, 2014, Paleobiology), this function is largely retained for legacy purposes and plotting applications. The time-scaling methods implemented by the functions listed here do not return realistic estimates of divergence dates; users should investigate other time-scaling methods, such as cal3TimePaleoPhy.

Usage

timePaleoPhy(tree, timeData, type = "basic", vartime = NULL, ntrees = 1,
  randres = FALSE, timeres = FALSE, add.term = FALSE,
  inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
  noisyDrop = TRUE, plot = FALSE)

bin_timePaleoPhy(tree, timeList, type = "basic", vartime = NULL,
  ntrees = 1, nonstoch.bin = FALSE, randres = FALSE, timeres = FALSE,
  sites = NULL, point.occur = FALSE, add.term = FALSE,
  inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
  noisyDrop = TRUE, plot = FALSE)

Arguments

tree
An unscaled cladogram of fossil taxa, of class 'phylo'. Tip labels must match the taxon labels in the respective temporal data.
timeData
Two-column matrix of first and last occurrences in absolute continuous time, with row names as the taxon IDs used on the tree. This means the first column is very precise FADs (first appearance dates) and the second column is very precise LADs (last ap
type
Type of time-scaling method used. Can be "basic", "equal", "equal_paleotree_legacy", "equal_date.phylo_legacy", "aba", "zlba" or "mbl". The default is type = "basic". See details below.
vartime
Time variable; usage depends on the method 'type' argument. Ignored if type = "basic".
ntrees
Number of time-scaled trees to output. If ntrees is greater than one, randres is FALSE, and dateTreatment is neither 'minMax' nor 'randObs', the function will fail and a warning is issued, as these arguments would simply produce multiple identic
randres
Should polytomies be randomly resolved? By default, timePaleoPhy does not resolve polytomies, instead outputting a time-scaled tree that is only as resolved as the input tree. If randres=T, then polytomies will be randomly resolved using
timeres
Should polytomies be resolved relative to the order of appearance of lineages? By default, timePaleoPhy does not resolve polytomies, instead outputting a time-scaled tree that is only as resolved as the input tree. If timeres=T, then polytomies will be
add.term
If true, adds terminal ranges. By default, this function will not add the ranges of taxa when time-scaling a tree, so that the tips correspond temporally to the first appearance datums of the given taxa. If add.term=T, then the 'terminal ranges' of the
inc.term.adj
If true, includes terminal ranges in branch length estimates for the various adjustments of branch lengths under all methods except 'basic' (i.e. a terminal length branch will not be treated as zero length if this argument is TRUE if the taxon at this t
dateTreatment
This argument controls the interpretation of timeData. The default setting 'firstLast' treats the dates in timeData as a column of precise first and last appearances, such that first appearances will be used to date nodes and last appearances will only
node.mins
The minimum dates of internal nodes (clades) on a phylogeny can be set using node.mins. This argument takes a vector of the same length as the number of nodes, with dates given in the same order as nodes are ordered in the tree$edge matrix.
noisyDrop
If TRUE (the default), any taxa dropped from tree due to not having a matching entry in the time data will be listed in a system message.
plot
If true, plots the input and output phylogenies.
timeList
A list composed of two matrices giving interval times and taxon appearance dates. The rownames of the second matrix should be the taxon IDs, identical to the tip.labels for tree. See details.
nonstoch.bin
If true, dates are not stochastically pulled from uniform distributions. See below for more details.
sites
Optional two column matrix, composed of site IDs for taxon FADs and LADs. The sites argument allows users to constrain the placement of dates by restricting multiple fossil taxa whose FADs or LADs are from the same very temporally restricted sites (suc
point.occur
If true, will automatically produce a 'sites' matrix which forces all FADs and LADs to equal each other. This should be used when all taxa are only known from single 'point occurrences', i.e. each is only recovered from a single bed/horizon, such as a
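
As a minimal sketch of the timeData format described above, the following base R code builds a two-column matrix of first and last appearance dates and checks it against a set of tip labels. The taxon names, dates, and the tipLabels vector are hypothetical values chosen for illustration; in practice the labels would come from tree$tip.label.

```r
# Hypothetical taxon names and dates, for illustration only.
# Time is measured in units before present, so FAD >= LAD.
timeData <- cbind(
  FAD = c(10.5, 8.2, 7.9),
  LAD = c(6.1, 8.2, 3.0)
)
rownames(timeData) <- c("t1", "t2", "t3")

# Tip labels as they would appear in tree$tip.label (assumed here).
tipLabels <- c("t1", "t2", "t3", "t4")

# Tips with no matching row in timeData; per the Details section,
# such taxa are dropped from the tree before time-scaling.
unmatched <- setdiff(tipLabels, rownames(timeData))

# First appearances should not be younger than last appearances.
validRanges <- all(timeData[, "FAD"] >= timeData[, "LAD"])
```

Running a check like this before calling timePaleoPhy makes it easier to anticipate which taxa will be reported as dropped when noisyDrop is TRUE.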

Value

  • The output of these functions is a time-scaled tree or set of time-scaled trees, of either class phylo or multiPhylo, depending on the argument ntrees. All trees are output with an element $root.time. This is the time of the root on the tree and is important for comparing patterns across trees. Note that the $root.time element is defined relative to the earliest first appearance date, and thus later tips may seem to occur in the distant future under the 'aba' and 'zlba' time-scaling methods. Trees created with bin_timePaleoPhy are output with some additional elements, in particular $ranges.used, a matrix which records the continuous-time ranges generated for time-scaling each tree (essentially a pseudo-timeData matrix).
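
To illustrate how $root.time anchors a tree in absolute time, here is a rough base R sketch that converts root-to-node distances into ages before present. The tree is a hand-built list that only mimics the edge-matrix layout of a 'phylo' object, and all values are hypothetical; it is not output from timePaleoPhy.

```r
# A hand-built three-taxon tree in the 'phylo' edge-matrix layout
# (tips numbered 1-3, root is node 4); values are hypothetical.
tree <- list(
  edge = matrix(c(4, 1,
                  4, 5,
                  5, 2,
                  5, 3), ncol = 2, byrow = TRUE),
  edge.length = c(2, 1, 1, 3),
  root.time = 6
)

# Distance from the root to every node, filled in edge by edge.
# This relies on each parent appearing as a child earlier in the
# edge matrix (true for the ordering above).
nNodes <- max(tree$edge)
distFromRoot <- numeric(nNodes)
for (i in seq_len(nrow(tree$edge))) {
  parent <- tree$edge[i, 1]
  child <- tree$edge[i, 2]
  distFromRoot[child] <- distFromRoot[parent] + tree$edge.length[i]
}

# Absolute age (time before present) of each tip and node.
nodeAges <- tree$root.time - distFromRoot
```

Because time counts down toward the present, subtracting the distance from the root gives the age of each node; two trees with different $root.time values can then be compared on the same absolute scale.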

Details

Time-Scaling Methods
These functions are an attempt to unify and collect previously used and discussed methods for time-scaling phylogenies of fossil taxa. Unfortunately, it can be difficult to attribute some time-scaling methods to specific references in the literature.
There are five main method types that can be used by timePaleoPhy. Four of these main types use some value of absolute time, chosen a priori, to time-scale the tree. This is handled by the argument vartime, which is NULL by default and unused for type "basic".
These functions cannot time-scale branches relative to reconstructed character changes along branches, as used by Lloyd et al. (2012). Please see DatePhylo in R package strap for this functionality.
These functions will automatically drop from the tree any taxa that have NA for a range or that are missing from timeData or timeList. Taxa dropped from the tree will be listed in a message output to the user. The same is done for taxa in the timeList object not listed in the tree.
As with many functions in the paleotree library, absolute time always decreases toward the present, i.e. the present day is zero.
As of August 2014, please note that the branch-ordering algorithm used in 'equal' has changed to match the current algorithm used by DatePhylo in package strap, and that two legacy versions of 'equal' have been added to this function, respectively representing how timePaleoPhy and DatePhylo (and its predecessor date.phylo) applied the 'equal' time-scaling method.

Interpretation of Taxon Ages in timePaleoPhy
timePaleoPhy is primarily designed for direct application to datasets where taxon first and last appearances are precisely known in continuous time, with no stratigraphic uncertainty.
This is an uncommon form of data to have from the fossil record, although not an impossible form (micropaleontologists often have very precise range charts, for example). Instead, most data has some form of stratigraphic uncertainty. However, for some groups, the more typical 'first' and 'last' dates found in the literature or in databases represent the minimum and maximum absolute ages for the fossil collections that a taxon is known from. Presumably, the first and last appearances of that taxon in the fossil record are at unknown dates within these bounds.
As of paleotree version 2.0, the treatment of taxon ages in timePaleoPhy is handled by the argument dateTreatment. By default, this argument is set to 'firstLast', which means the matrix of ages is treated as precise first and last appearance dates (i.e. FADs and LADs). The earlier FADs will be used to calibrate the node ages, which could produce fairly nonsensical results if these are 'minimum' ages instead and reflect age uncertainty.
Alternatively, dateTreatment can be set to 'minMax', which instead treats taxon age data as minimum and maximum bounds on a single point date. Under this option, the point dates are chosen from a uniform distribution between the bounds. Many time-scaled trees should be created to approximate the uncertainty in the dates.
Additionally, there is a third option for dateTreatment, 'randObs': users may make the 'times of observation' of trees uncertain, such that the tips of the tree (with terminal ranges added) are randomly selected from a uniform distribution. Essentially, this third option treats the dates as first and last appearances, with the first appearance dates known and fixed but the 'last appearance' dates unknown. In previous versions of paleotree, this third option was enacted with the argument rand.obs, which has been removed for clarity.
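
The 'minMax' treatment described above can be sketched in base R as repeated uniform draws between each taxon's age bounds. This is only an illustration of the idea, not the package's internal code; the bounds matrix and the helper drawPointDates are hypothetical.

```r
set.seed(1)

# Hypothetical minimum/maximum age bounds (time before present),
# one row per taxon; column 1 is the older bound.
bounds <- cbind(c(12, 9, 7), c(10, 6, 7))
rownames(bounds) <- c("t1", "t2", "t3")

# Draw one point date per taxon, uniformly between its bounds.
# (Hypothetical helper, not a paleotree function.)
drawPointDates <- function(b) {
  runif(nrow(b), min = b[, 2], max = b[, 1])
}

# Repeat the draw to approximate the uncertainty in the dates;
# the result has one row per taxon and one column per replicate.
draws <- replicate(100, drawPointDates(bounds))
rownames(draws) <- rownames(bounds)
```

Each column of draws plays the role of one set of point dates, which is why many time-scaled trees are needed to represent the spread of plausible ages.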
Interpretation of Taxon Ages in bin_timePaleoPhy
As an alternative to using timePaleoPhy, bin_timePaleoPhy is a wrapper of timePaleoPhy which produces time-scaled trees for datasets which only have interval data available. For each output tree, taxon first and last appearance dates are placed within their listed intervals under a uniform distribution. Thus, a large sample of time-scaled trees will approximate the uncertainty in the actual timing of the FADs and LADs.
In some ways, treating taxonomic age uncertainty may be more logical via bin_timePaleoPhy, as it is tied to specific interval bounds, and there are more options available for certain types of age uncertainty, such as for cases where specimens come from the same fossil site.
The input timeList object for bin_timePaleoPhy can have overlapping (i.e. non-sequential) intervals, and intervals of uneven size. Taxa alive in the modern should be listed as last occurring in a time interval that begins at time 0 and ends at time 0.
If taxa occur only in single collections (i.e. their first and last appearances in the fossil record are synchronous), the argument point.occur will force all taxa to have instantaneous durations in the fossil record. Otherwise, by default, taxa are assumed to first and last appear in the fossil record at different points in time, with some positive duration. The sites matrix can be used to force only a portion of taxa to have simultaneous first and last appearances.
By setting the argument nonstoch.bin to TRUE for bin_timePaleoPhy, the dates are NOT stochastically pulled from uniform bins; instead, FADs are assigned to the earliest time of whichever interval they were placed in and LADs are placed at the most recent time in their placed interval. This option may be useful for plotting. The sites argument becomes arbitrary if nonstoch.bin is TRUE.
If timeData or the elements of timeList are actually data.frames (as output by read.csv or read.table), these will be coerced to a matrix.
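
The placement of dates within intervals can be sketched in base R as follows. This is an illustrative approximation, not the code bin_timePaleoPhy runs: the timeList layout is assumed (a matrix of interval start and end times, plus a matrix giving each taxon's first and last interval as row indices), the helper placeDates is hypothetical, and ordering the two draws by swapping is a choice made for this sketch.

```r
set.seed(2)

# Assumed timeList layout; all values are hypothetical.
intTimes <- cbind(start = c(30, 20, 10), end = c(20, 10, 0))
taxonInts <- cbind(first = c(1, 1, 2), last = c(2, 1, 3))
rownames(taxonInts) <- c("t1", "t2", "t3")

# Hypothetical helper, not a paleotree function.
placeDates <- function(intTimes, taxonInts, nonstoch = FALSE) {
  fInt <- taxonInts[, 1]
  lInt <- taxonInts[, 2]
  if (nonstoch) {
    # FAD at the earliest time of its interval, LAD at the latest.
    fad <- intTimes[fInt, 1]
    lad <- intTimes[lInt, 2]
  } else {
    # Uniform draws within each interval's bounds.
    fad <- runif(length(fInt), intTimes[fInt, 2], intTimes[fInt, 1])
    lad <- runif(length(lInt), intTimes[lInt, 2], intTimes[lInt, 1])
    # Order the draws so the FAD is never younger than the LAD
    # (matters when both dates fall in the same interval).
    swap <- fad < lad
    tmp <- fad[swap]
    fad[swap] <- lad[swap]
    lad[swap] <- tmp
  }
  out <- cbind(FAD = fad, LAD = lad)
  rownames(out) <- rownames(taxonInts)
  out
}

stochRanges <- placeDates(intTimes, taxonInts)
fixedRanges <- placeDates(intTimes, taxonInts, nonstoch = TRUE)
```

The stochastic result varies from call to call, while the nonstoch = TRUE result always spans the full width of the listed intervals, which matches the description of nonstoch.bin as mainly useful for plotting.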
Tutorial
A tutorial for applying the time-scaling functions in paleotree, along with an example using real (graptolite) data, can be found here: http://nemagraptus.blogspot.com/2013/06/a-tutorial-to-cal3-time-scaling-using.html

References

Bapst, D. W. 2013. A stochastic rate-calibrated method for time-scaling phylogenies of fossil taxa. Methods in Ecology and Evolution 4(8):724-733.
Bapst, D. W. 2014. Assessing the effect of time-scaling methods on phylogeny-based analyses in the fossil record. Paleobiology 40(3):331-351.
Brusatte, S. L., M. J. Benton, M. Ruta, and G. T. Lloyd. 2008. Superiority, Competition, and Opportunism in the Evolutionary Radiation of Dinosaurs. Science 321(5895):1485-1488.
Hunt, G., and M. T. Carrano. 2010. Models and methods for analyzing phenotypic evolution in lineages and clades. In J. Alroy and G. Hunt, eds. Short Course on Quantitative Methods in Paleobiology. Paleontological Society.
Laurin, M. 2004. The Evolution of Body Size, Cope's Rule and the Origin of Amniotes. Systematic Biology 53(4):594-622.
Lloyd, G. T., S. C. Wang, and S. L. Brusatte. 2012. Identifying Heterogeneity in Rates of Morphological Evolution: Discrete Character Change in the Evolution of Lungfish (Sarcopterygii, Dipnoi). Evolution 66(2):330-348.
Smith, A. B. 1994. Systematics and the fossil record: documenting evolutionary patterns. Blackwell Scientific, Oxford.

See Also

cal3TimePaleoPhy, binTimeData, multi2di

For an alternative time-scaling function, which includes the 'ruta' method that weights the time-scaling of branches by estimates of character change, along with implementations of the 'basic' and 'equal' methods described here, please see the function DatePhylo in package strap.

Examples

#load the package providing these functions and datasets
library(paleotree)

# examples with empirical data

#load data
data(retiolitinae)

#Can plot the unscaled cladogram
plot(retioTree)
#Can plot discrete time interval diversity curve with retioRanges
taxicDivDisc(retioRanges)

#Use basic time-scaling (terminal branches only go to FADs)
ttree<-bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="basic",
	ntrees=1, plot=TRUE)

#Use basic time-scaling (terminal branches go to LADs)
ttree<-bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="basic",
	add.term=TRUE, ntrees=1, plot=TRUE)

#minimum branch length time-scaling (terminal branches only go to FADs)
ttree<-bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="mbl",
	vartime=1, ntrees=1, plot=TRUE)

###################

# examples with simulated data

#Simulate some fossil ranges with simFossilTaxa
set.seed(444)
taxa <- simFossilTaxa(p=0.1,q=0.1,nruns=1,mintaxa=20,maxtaxa=30,maxtime=1000,maxExtant=0)
#simulate a fossil record with imperfect sampling with sampleRanges
rangesCont <- sampleRanges(taxa,r=0.5)
#let's use taxa2cladogram to get the 'ideal' cladogram of the taxa
cladogram <- taxa2cladogram(taxa,plot=TRUE)
#Now let's try timePaleoPhy using the continuous range data
ttree <- timePaleoPhy(cladogram,rangesCont,type="basic",plot=TRUE)
#plot diversity curve
phyloDiv(ttree)

#that tree lacked the terminal parts of ranges (tips stop at the taxon FADs)
#let's add those terminal ranges back on with add.term
ttree <- timePaleoPhy(cladogram,rangesCont,type="basic",add.term=TRUE,plot=TRUE)
#plot diversity curve
phyloDiv(ttree)

#that tree didn't look very resolved, did it? (See Wagner and Erwin 1995 to see why)
#can randomly resolve trees using the argument randres
#each resulting tree will have polytomies randomly resolved in different ways using multi2di
ttree <- timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=1,randres=TRUE,
    add.term=TRUE,plot=TRUE)
#notice well the warning it prints!
#we would need to set ntrees to a large number to get a fair sample of trees

#if we set ntrees>1, timePaleoPhy will make multiple time-trees
ttrees <- timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=9,randres=TRUE,
    add.term=TRUE,plot=TRUE)
#let's compare nine of them at once in a plot
layout(matrix(1:9,3,3))
parOrig <- par(no.readonly=TRUE)
par(mar=c(1,1,1,1))
for(i in 1:9){plot(ladderize(ttrees[[i]]),show.tip.label=FALSE,no.margin=TRUE)}
#they are all a bit different!

#we can also resolve the polytomies in the tree according to time of first appearance
	#via the function timeLadderTree, by setting the argument 'timeres' to TRUE
ttree <- timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=1,timeres=TRUE,
    add.term=TRUE,plot=TRUE)

#can plot the median diversity curve with multiDiv
layout(1);par(parOrig)
multiDiv(ttrees)

#compare different methods of timePaleoPhy
layout(matrix(1:6,3,2))
parOrig <- par(no.readonly=TRUE)
par(mar=c(3,2,1,2))
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="basic",vartime=NULL,add.term=TRUE)))
    axisPhylo();text(x=50,y=23,"type=basic",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="equal",vartime=10,add.term=TRUE)))
    axisPhylo();text(x=55,y=23,"type=equal",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="aba",vartime=1,add.term=TRUE)))
    axisPhylo();text(x=55,y=23,"type=aba",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="zlba",vartime=1,add.term=TRUE)))
    axisPhylo();text(x=55,y=23,"type=zlba",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="mbl",vartime=1,add.term=TRUE)))
    axisPhylo();text(x=55,y=23,"type=mbl",adj=c(0,0.5),cex=1.2)
layout(1);par(parOrig)

#using node.mins
#let's say we have (molecular??) evidence that node #5 is at least 1200 time-units ago
#to use node.mins, first need to drop any unshared taxa
droppers <- cladogram$tip.label[is.na(
      match(cladogram$tip.label,names(which(!is.na(rangesCont[,1])))))]
cladoDrop <- drop.tip(cladogram, droppers)
# now make vector same length as number of nodes
nodeDates <- rep(NA, Nnode(cladoDrop))
nodeDates[5] <- 1200
ttree1 <- timePaleoPhy(cladoDrop,rangesCont,type="basic",
  	randres=FALSE,node.mins=nodeDates,plot=TRUE)
ttree2 <- timePaleoPhy(cladoDrop,rangesCont,type="basic",
   	randres=TRUE,node.mins=nodeDates,plot=TRUE)

#Using bin_timePaleoPhy to timescale with discrete interval data
#first let's use binTimeData() to bin in intervals of 1 time unit
rangesDisc <- binTimeData(rangesCont,int.length=1)
ttreeB1 <- bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,randres=TRUE,
    add.term=TRUE,plot=FALSE)
#notice the warning it prints!
phyloDiv(ttreeB1)
#with time-order resolving via timeLadderTree
ttreeB2 <- bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,timeres=TRUE,
    add.term=TRUE,plot=FALSE)
phyloDiv(ttreeB2)
#can also force the appearance timings not to be chosen stochastically
ttreeB3 <- bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,
    nonstoch.bin=TRUE,randres=TRUE,add.term=TRUE,plot=FALSE)
phyloDiv(ttreeB3)

#simple three taxon example for testing inc.term.adj
ranges1<-cbind(c(3,4,5),c(2,3,1));rownames(ranges1)<-paste("t",1:3,sep="")
clado1<-read.tree(file=NA,text="(t1,(t2,t3));")
ttree1<-timePaleoPhy(clado1,ranges1,type="mbl",vartime=1)
ttree2<-timePaleoPhy(clado1,ranges1,type="mbl",vartime=1,add.term=TRUE)
ttree3<-timePaleoPhy(clado1,ranges1,type="mbl",vartime=1,add.term=TRUE,inc.term.adj=TRUE)
layout(1:3)
ttree1$root.time;plot(ttree1);axisPhylo()
ttree2$root.time;plot(ttree2);axisPhylo()
ttree3$root.time;plot(ttree3);axisPhylo()
-apply(ranges1,1,diff)