phangorn (version 1.2-0)

pml: Likelihood of a tree.

Description

pml computes the likelihood of a phylogenetic tree given a sequence alignment and a model. optim.pml optimizes the different model parameters.

Usage

pml(tree, data, bf=NULL, Q=NULL, inv=0, k=1, shape=1, rate=1, model="", ...)     
optim.pml(object, optNni=FALSE, optBf=FALSE, optQ=FALSE,
    optInv=FALSE, optGamma=FALSE, optEdge=TRUE, optRate=FALSE, optRooted=FALSE, 
    control = pml.control(eps=1e-08, maxit=10, trace=1), model = NULL, subs = NULL, ...)  
pml.control(epsilon = 1e-08, maxit = 10, trace = 1)

Arguments

tree
A phylogenetic tree, object of class phylo.
data
The (DNA) alignment.
bf
Base frequencies.
Q
A vector containing the lower triangular part of the rate matrix.
inv
Proportion of invariable sites.
k
Number of intervals of the discrete gamma distribution.
shape
Shape parameter of the gamma distribution.
rate
Rate.
model
In pml: an amino acid model, one of "WAG", "JTT", "Dayhoff" or "LG". In optim.pml: the nucleotide model to optimize (see details).
object
An object of class pml.
optNni
Logical value indicating whether the topology gets optimized (NNI).
optBf
Logical value indicating whether base frequencies get optimized.
optQ
Logical value indicating whether the rate matrix gets optimized.
optInv
Logical value indicating whether the proportion of invariable sites gets optimized.
optGamma
Logical value indicating whether the gamma shape parameter gets optimized.
optEdge
Logical value indicating whether the edge lengths get optimized.
optRate
Logical value indicating whether the overall rate gets optimized.
optRooted
Logical value indicating whether the edge lengths of a rooted tree get optimized.
control
A list of parameters for controlling the fitting process.
subs
An integer vector of the same length as Q that specifies the optimization of Q.
...
Further arguments passed to or from other methods.
epsilon
Stop criterion for optimisation (see details).
maxit
Maximum number of iterations (see details).
trace
Show output during optimization (see details).

Value

Returns a list of class ll.phylo including the following components:

  • logLik: Log likelihood of the tree.
  • siteLik: Site log likelihoods.
  • root: Likelihood in the root node.
  • weight: Weight of the site patterns.

Details

The topology search uses nearest neighbor interchange (NNI); the implementation is similar to phyML.

The option model in pml is only used for amino acid models. In optim.pml, the option model defines the nucleotide model that is optimized; all models included in modeltest can be chosen. Setting this option (e.g. "K81" or "GTR") overrules the options optBf and optQ.

Here is an overview of how to estimate different phylogenetic models with pml:

model          optBf   optQ
Jukes-Cantor   FALSE   FALSE
F81            TRUE    FALSE
symmetric      FALSE   TRUE
GTR            TRUE    TRUE

Via model in optim.pml the following nucleotide models can be specified: JC, F81, K80, HKY, TrNe, TrN, TPM1, K81, TPM1u, TPM2, TPM2u, TPM3, TPM3u, TIM1e, TIM1, TIM2e, TIM2, TIM3e, TIM3, TVMe, TVM, SYM and GTR. For how these models are specified see Posada (2008). So far 4 amino acid models are supported ("WAG", "JTT", "Dayhoff" and "LG").

If the option optRooted is set to TRUE, then the edge lengths of a rooted tree are optimized. The tree has to be rooted and ultrametric! No tree rearrangements are supported yet. If optRooted=FALSE, any rooted tree is unrooted.

pml.control controls the fitting process. epsilon and maxit apply only to the outermost loop; this affects pmlCluster, pmlPart and pmlMix. epsilon is defined as (logLik(k)-logLik(k+1))/logLik(k+1), a heuristic that seems to work reasonably well for both small and large trees or alignments. If trace is set to zero, no output is shown; when functions are called internally, trace is decreased by one.
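The stopping rule described for pml.control can be illustrated with a small base-R sketch. This is not phangorn's internal code: rel_change and ll_trace are hypothetical names, the log-likelihood values are made up for illustration, and the comparison "stop once the relative change falls below epsilon" is an assumption about how the criterion is applied.

```r
# Relative change between successive log-likelihoods, using the formula
# from Details: (logLik(k) - logLik(k+1)) / logLik(k+1).
# rel_change is a hypothetical helper, not a phangorn function.
rel_change <- function(ll_old, ll_new) (ll_old - ll_new) / ll_new

# Made-up log-likelihoods from successive outer-loop iterations.
ll_trace <- c(-3500, -3300, -3295, -3294.99999)

epsilon <- 1e-08  # same default as pml.control
maxit   <- 10     # same default as pml.control

iter <- 1
while (iter < length(ll_trace) && iter <= maxit) {
  change <- rel_change(ll_trace[iter], ll_trace[iter + 1])
  cat(sprintf("iteration %d: relative change = %.3g\n", iter, change))
  # Assumed comparison: stop once the relative improvement is below epsilon.
  if (change < epsilon) break
  iter <- iter + 1
}
```

Because log-likelihoods are negative, an improvement (a larger logLik(k+1)) makes both numerator and denominator negative, so the ratio is positive and shrinks toward zero as the fit converges. Dividing by logLik(k+1) puts the change on a relative scale, which is why the same epsilon can serve for small and large data sets.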

References

Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368--376.

Felsenstein, J. (2004) Inferring Phylogenies. Sinauer Associates, Sunderland.

Yang, Z. (2006) Computational Molecular Evolution. Oxford University Press, Oxford.

Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution, 18, 691--699.

Le, S.Q. and Gascuel, O. (2008) LG: An improved, general amino-acid replacement matrix. Molecular Biology and Evolution, 25(7), 1307--1320.

Posada, D. (2008) jModelTest: Phylogenetic model averaging. Molecular Biology and Evolution, 25, 1253--1256.

Examples

example(NJ)
# Jukes-Cantor (starting tree from NJ)  
  fitJC <- pml(tree, Laurasiatherian)  
# optimize edge length parameter     
  fitJC <- optim.pml(fitJC)
  fitJC 
  
# search for a better tree using NNI rearrangements     
  fitJC <- optim.pml(fitJC, optNni=TRUE)
  fitJC   
  plot(fitJC$tree)

# JC + Gamma + I - model
  fitJC_GI <- update(fitJC, k=4, inv=.2)
# optimize shape parameter + proportion of invariant sites     
  fitJC_GI <- optim.pml(fitJC_GI, optGamma=TRUE, optInv=TRUE)
# GTR + Gamma + I - model
  fitGTR <- optim.pml(fitJC_GI, optNni=TRUE, optGamma=TRUE, optInv=TRUE, optBf=TRUE, optQ=TRUE)

# 2-state data (RY-coded)  
    
  dat <- as.character(Laurasiatherian)
  # RY-coding
  dat[dat=="a"] <- "r"
  dat[dat=="g"] <- "r"
  dat[dat=="c"] <- "y"
  dat[dat=="t"] <- "y"
  dat <- phyDat(dat, levels=c("r","y"))
  fit2ST <- pml(tree, dat, k=4, inv=.25) 
  fit2ST <- optim.pml(fit2ST,optNni=TRUE, optGamma=TRUE, optInv=TRUE) 
  fit2ST
  # show some of the methods available for class pml
  methods(class="pml")
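As a further sketch, the model argument of optim.pml can select a named nucleotide model instead of setting optBf and optQ by hand, and pml.control(trace = 0) can silence the progress output. The snippet assumes phangorn is installed and rebuilds the NJ starting tree itself, so it does not depend on example(NJ); the object names fit and fitHKY are illustrative only.

```r
library(phangorn)

# Starting tree: neighbor joining on maximum-likelihood distances.
data(Laurasiatherian)
tree <- NJ(dist.ml(Laurasiatherian))

# Start with 4 gamma rate categories and 20% invariable sites.
fit <- pml(tree, Laurasiatherian, k = 4, inv = 0.2)

# Choosing model = "HKY" overrules optBf and optQ (see Details);
# trace = 0 suppresses the output shown during optimization.
fitHKY <- optim.pml(fit, model = "HKY", optGamma = TRUE, optInv = TRUE,
                    control = pml.control(trace = 0))
logLik(fitHKY)
```

Any other model name listed in Details, such as "K80" or "GTR", could be substituted for "HKY" in the same call.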