HyPhy (version 1.0)

recon.score: Duplications and losses for gene tree in a species tree

Description

Calculates the minimum number of gene duplications and gene losses necessary to reconcile a gene tree with a species tree.

Usage

recon.score(phy, phy.sub, reconcile = NULL)

Arguments

phy
The species tree to be reconciled, of class “phylo”
phy.sub
The gene tree(s) to be reconciled, of class “phylo” or “multiPhylo”
reconcile
A description of the relationship between the gene tree and the species tree: either NULL; a vector of positive integers of length length(phy.sub$tip.label); or a vector of integers and NA of length max(phy.sub$edge). (See details)

Value

If class(phy.sub)=="phylo", the output is a vector with two elements: the number of duplications and the number of losses. If class(phy.sub)=="multiPhylo", the output is a matrix with two columns, the number of duplications and the number of losses, with each row representing a different tree in phy.sub.

Details

reconcile is a vector in which the ith element gives the position on phy of the node of the gene tree labeled i in phy.sub$edge. A positive value n places the ith phy.sub node on the node of the species tree labeled n in phy$edge, a negative value n places the ith phy.sub node on the branch of the species tree defined by the nth row of phy$edge, and a 0 places the ith phy.sub node on the root of the species tree. Elements for which is.na(reconcile[i]) or i > length(reconcile) are undefined, so if reconcile=NULL, then all elements are undefined. The first length(phy.sub$tip.label) elements of reconcile refer to the tips of phy.sub; if the ith element is undefined, then the function will place the ith tip of phy.sub on the tip of phy for which phy$tip.label==phy.sub$tip.label[i]. The remaining elements of reconcile refer to the internal nodes of phy.sub; if the ith element is undefined, then the function will place the ith node of phy.sub at its maximum parsimony position given the position of the two nodes above it.
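The maximum parsimony placement of internal nodes can be illustrated with the standard last-common-ancestor (LCA) mapping. The sketch below is not the package's code and does not call recon.score; it uses base R only, with edge matrices written out by hand in the numbering that read.tree is assumed to give for the two trees in the Examples section. It maps each gene tree node to a species tree node (not to a branch) and counts duplications only; lca, ancestors and the other names are hypothetical.

```r
# Species tree ((A,B),C): tips A=1, B=2, C=3; root = 4; node 5 = (A,B).
spec_edge <- rbind(c(4, 5), c(5, 1), c(5, 2), c(4, 3))
# Gene tree ((A1,(A2,B1)),(B2,(C1,C2))): tips 1..6, root = 7.
gene_edge <- rbind(c(7, 8), c(8, 1), c(8, 9), c(9, 2), c(9, 3),
                   c(7, 10), c(10, 4), c(10, 11), c(11, 5), c(11, 6))
# Gene tips assigned to species tips, as in reconcile <- c(1,1,2,2,3,3).
tip_map <- c(1, 1, 2, 2, 3, 3)

# Parent of every species node; the root keeps parent 0.
spec_parent <- integer(max(spec_edge))
spec_parent[spec_edge[, 2]] <- spec_edge[, 1]

# Path from a species node down to the root, the node itself included.
ancestors <- function(node) {
  path <- node
  while (spec_parent[node] != 0) {
    node <- spec_parent[node]
    path <- c(path, node)
  }
  path
}

# Last common ancestor: first node on a's path that is also on b's path.
lca <- function(a, b) {
  path_a <- ancestors(a)
  path_a[path_a %in% ancestors(b)][1]
}

n_tips <- length(tip_map)
mapping <- c(tip_map, rep(NA, max(gene_edge) - n_tips))
is_dup <- logical(max(gene_edge))

# Assumption: every internal node has a lower number than its descendants,
# so counting down visits both children before their parent.
for (node in max(gene_edge):(n_tips + 1)) {
  kid_maps <- mapping[gene_edge[gene_edge[, 1] == node, 2]]
  mapping[node] <- lca(kid_maps[1], kid_maps[2])
  # A node mapped to the same species node as one of its children is a duplication.
  is_dup[node] <- mapping[node] %in% kid_maps
}

n_dup <- sum(is_dup)
mapping
n_dup
```

Under these assumptions, gene nodes 7, 8 and 11 come out as duplications. Counting losses needs extra bookkeeping along the species tree branches and is left out of this sketch.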

To make a long story short, if the labels in phy.sub$tip.label match the labels in phy$tip.label, as they would for trees produced by rgenetree, and you want a maximum parsimony reconciliation, then set reconcile=NULL. On the other hand, if the labels do not match and you want a maximum parsimony reconciliation, then reconcile should be a vector of positive integers assigning the tips of phy.sub to the tips of phy. Finally, if you want a reconciliation that is not maximum parsimony, then reconcile should be of length max(phy.sub$edge), and the nodes of phy.sub should be assigned to nodes and branches of phy as described above. Be warned that any internal node of phy.sub can only be assigned to certain nodes and branches of phy, and there is no mechanism to check the logical consistency of reconcile, so make sure that you assign nodes to appropriate branches. You can always leave a node as NA, and it will end up in its maximum parsimony position.
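When the tip labels do not match, the tip part of reconcile can be built programmatically rather than by hand. The following is a minimal base R sketch, assuming a hypothetical naming convention in which each gene label is the species label followed by digits (A1, A2, B1, ...); describe_placement is an illustrative helper, not part of the package.

```r
# Tip labels as in the Examples section (species tree, then gene tree).
spec_tips <- c("A", "B", "C")
gene_tips <- c("A1", "A2", "B1", "B2", "C1", "C2")

# Assumption: the species name is the gene label minus its trailing digits.
species_of <- sub("[0-9]+$", "", gene_tips)
tip_part <- match(species_of, spec_tips)
stopifnot(!anyNA(tip_part))  # every gene tip must find a species tip

# Assumption: the gene tree is rooted and binary, so n tips imply n - 1
# internal nodes. NA leaves a node at its maximum parsimony position.
n_tips <- length(gene_tips)
reconcile <- c(tip_part, rep(NA_integer_, n_tips - 1))

# Force gene node 9 onto the species branch in row 1 of the species edge matrix.
reconcile[9] <- -1L

# Hypothetical helper: spell out what each code in reconcile means.
describe_placement <- function(code) {
  if (is.na(code)) {
    "maximum parsimony position"
  } else if (code > 0) {
    paste("species node", code)
  } else if (code < 0) {
    paste("species branch, edge row", -code)
  } else {
    "species root"
  }
}
placements <- vapply(reconcile, describe_placement, character(1))
data.frame(gene_node = seq_along(reconcile), code = reconcile, placements)
```

Printing the decoded placements before calling recon.score is a cheap sanity check, since the function itself does not validate reconcile.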

References

M. Goodman, J. Czelusniak, G. Moore, A. Romero-Herrera, G. Matsuda, Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences, Syst. Zool. 28 (1979) 132-163.

See Also

rgenetree, plot.recon (coming soon)

Examples

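##Load ape (for read.tree) and this package (for rgenetree and recon.score)
library(ape)
library(HyPhy)
##First we need a simple species tree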
spec<-read.tree(text="((A:0.5,B:0.5):0.5,C:1);")
##Now let's simulate a bunch of gene family trees
genes<-rgenetree(10,spec,0.5,0.5,3,10,TRUE)
##Let's look at those trees
##Note that all their tips are labeled A, B or C, just like spec
plot(genes)
##Therefore we can calculate the counts for all trees without any other info
recon.score(spec,genes)
##On the other hand, if we make our own gene tree with different labels 
gene<-read.tree(text="((A1,(A2,B1)),(B2,(C1,C2)));")
##We must generate a reconcile vector
##to do so we must know the positions of the tip labels in both phylogenies
spec$tip.label
gene$tip.label
reconcile<-c(1,1,2,2,3,3)
recon.score(spec,gene,reconcile)
##To force the node at the base of A2 and B1 down to the branch below A and B
##we must look at both edge matrices to learn how each node and branch are labeled
spec$edge
gene$edge
reconcile<-c(1,1,2,2,3,3,NA,NA,-1,NA,NA)
recon.score(spec,gene,reconcile)