phangorn (version 1.5-0)

pml: Likelihood of a tree.

Description

pml computes the likelihood of a phylogenetic tree given a sequence alignment and a model. optim.pml optimizes the different model parameters.

Usage

pml(tree, data, bf=NULL, Q=NULL, inv=0, k=1, shape=1, rate=1, model="", ...)     
optim.pml(object, optNni=FALSE, optBf=FALSE, optQ=FALSE,
    optInv=FALSE, optGamma=FALSE, optEdge=TRUE, optRate=FALSE, optRooted=FALSE, 
    control = pml.control(epsilon=1e-08, maxit=10, trace=1), model = NULL, subs = NULL, ...)  
pml.control(epsilon = 1e-08, maxit = 10, trace = 1)

Arguments

tree
A phylogenetic tree, object of class phylo.
data
The (DNA) alignment.
bf
Base frequencies.
Q
A vector containing the lower triangular part of the rate matrix.
inv
Proportion of invariable sites.
k
Number of intervals of the discrete gamma distribution.
shape
Shape parameter of the gamma distribution.
rate
Rate.
model
Selects an amino acid model or a nucleotide model, see details.
object
An object of class pml.
optNni
Logical value indicating whether the topology gets optimized (NNI).
optBf
Logical value indicating whether the base frequencies get optimized.
optQ
Logical value indicating whether the rate matrix gets optimized.
optInv
Logical value indicating whether the proportion of invariable sites gets optimized.
optGamma
Logical value indicating whether gamma rate parameter gets optimized.
optEdge
Logical value indicating whether the edge lengths get optimized.
optRate
Logical value indicating whether the overall rate gets optimized.
optRooted
Logical value indicating if the edge lengths of a rooted tree get optimized.
control
A list of parameters for controlling the fitting process.
subs
An (integer) vector of the same length as Q to specify the optimization of Q.
...
Further arguments passed to or from other methods.
epsilon
Stop criterion for optimisation (see details).
maxit
Maximum number of iterations (see details).
trace
Show output during optimization (see details).
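
A minimal sketch of how a control list might be built and what epsilon measures. The log-likelihood values are hypothetical, and the stopping rule (stop once the relative change falls below epsilon) is an assumption; the Details section only gives the formula.

```r
library(phangorn)

# Build a control list: looser tolerance, fewer outer iterations, no output.
ctrl <- pml.control(epsilon = 1e-06, maxit = 5, trace = 0)

# Relative change between successive outer iterations, as defined in Details:
# (logLik(k) - logLik(k+1)) / logLik(k+1)
rel_change <- function(ll_k, ll_k1) (ll_k - ll_k1) / ll_k1

ll_k  <- -3500.00   # hypothetical log-likelihood at iteration k
ll_k1 <- -3499.99   # hypothetical log-likelihood at iteration k + 1
eps <- rel_change(ll_k, ll_k1)

# TRUE would mean the (assumed) stopping rule is satisfied.
eps < ctrl$epsilon
```

The resulting list can be passed to optim.pml as control = ctrl.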

Value

  • Returns a list of class pml; its components include:
  • logLik: Log likelihood of the tree.
  • siteLik: Site log likelihoods.
  • root: Likelihood in the root node.
  • weight: Weight of the site patterns.
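
A minimal sketch of reading these components from a fitted object, assuming the Laurasiatherian data shipped with phangorn and an NJ starting tree built from dist.ml distances. The last line rests on the assumption that the total log-likelihood is the weighted sum of the per-pattern site log-likelihoods; it is worth checking rather than taking for granted.

```r
library(phangorn)

data(Laurasiatherian)
# Starting tree: neighbor joining on ML distances.
tree <- NJ(dist.ml(Laurasiatherian))

# Likelihood under the default (Jukes-Cantor) settings.
fit <- pml(tree, Laurasiatherian)

fit$logLik          # log-likelihood of the tree
head(fit$siteLik)   # log-likelihoods of the first few site patterns
head(fit$weight)    # how often each site pattern occurs

# Assumed relation: weighted sum of site log-likelihoods vs. total.
all.equal(sum(fit$weight * fit$siteLik), fit$logLik)
```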

Details

The topology search uses nearest neighbor interchange (NNI), and the implementation is similar to PhyML.

The option model in pml is only used for amino acid models. In optim.pml, the option model defines the nucleotide model that is optimized; all models included in modeltest can be chosen. Setting this option (e.g. "K81" or "GTR") overrules the options optBf and optQ. Here is an overview of how to estimate different phylogenetic models with pml:

  model          optBf   optQ
  Jukes-Cantor   FALSE   FALSE
  F81            TRUE    FALSE
  symmetric      FALSE   TRUE
  GTR            TRUE    TRUE

Via model in optim.pml the following nucleotide models can be specified: JC, F81, K80, HKY, TrNe, TrN, TPM1, K81, TPM1u, TPM2, TPM2u, TPM3, TPM3u, TIM1e, TIM1, TIM2e, TIM2, TIM3e, TIM3, TVMe, TVM, SYM and GTR. These models are specified as in Posada (2008).

So far 9 amino acid models are supported ("WAG", "JTT", "Dayhoff", "LG", "cpREV", "mtmam", "mtArt", "MtZoa" and "mtREV24"); additionally, rate matrices and amino acid frequencies can be supplied (see the arguments Q and bf).

If the option optRooted is set to TRUE, the edge lengths of a rooted tree are optimized. The tree has to be rooted and ultrametric! Tree rearrangements are not yet supported in this case. If optRooted=FALSE, any rooted tree is unrooted.

pml.control controls the fitting process. epsilon and maxit are only defined for the outermost loop; this affects pmlCluster, pmlPart and pmlMix. epsilon is defined as (logLik(k)-logLik(k+1))/logLik(k+1); this seems to be a good heuristic that works reasonably well for small and large trees or alignments. If trace is set to zero, no output is shown; when functions are called internally, trace is decreased by one.
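
A minimal sketch of the two ways of choosing a nucleotide model described above: by name through model, or through the optBf and optQ flags from the table. It assumes the Laurasiatherian data shipped with phangorn and an NJ starting tree built from dist.ml distances; the object names are illustrative.

```r
library(phangorn)

data(Laurasiatherian)
tree <- NJ(dist.ml(Laurasiatherian))
fit  <- pml(tree, Laurasiatherian)   # Jukes-Cantor starting point

quiet <- pml.control(trace = 0)      # suppress progress output

# By name: model overrules optBf and optQ.
fitHKY <- optim.pml(fit, model = "HKY", control = quiet)

# By flags: F81 estimates base frequencies but not the rate matrix.
fitF81 <- optim.pml(fit, optBf = TRUE, optQ = FALSE, control = quiet)

# Compare the fitted log-likelihoods.
c(JC = fit$logLik, F81 = fitF81$logLik, HKY = fitHKY$logLik)
```

Only edge lengths and substitution parameters are optimized here; adding optNni=TRUE would also search tree topologies.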

References

Felsenstein, J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368--376.

Felsenstein, J. (2004) Inferring Phylogenies. Sinauer Associates, Sunderland.

Yang, Z. (2006) Computational Molecular Evolution. Oxford University Press, Oxford.

Adachi, J., P. J. Waddell, W. Martin, and M. Hasegawa (2000) Plastid genome phylogeny and a model of amino acid substitution for proteins encoded by chloroplast DNA. Journal of Molecular Evolution, 50, 348--358.

Rota-Stabelli, O., Z. Yang, and M. Telford (2009) MtZoa: a general mitochondrial amino acid substitutions model for animal evolutionary studies. Mol. Phyl. Evol., 52(1), 268--72.

Whelan, S. and Goldman, N. (2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution, 18, 691--699.

Le, S.Q. and Gascuel, O. (2008) LG: An Improved, General Amino-Acid Replacement Matrix. Molecular Biology and Evolution, 25(7), 1307--1320.

Yang, Z., R. Nielsen, and M. Hasegawa (1998) Models of amino acid substitution and applications to mitochondrial protein evolution. Molecular Biology and Evolution, 15, 1600--1611.

Abascal, F., D. Posada, and R. Zardoya (2007) MtArt: A new Model of amino acid replacement for Arthropoda. Molecular Biology and Evolution, 24, 1--5.

Kosiol, C. and Goldman, N. (2005) Different versions of the Dayhoff rate matrix. Molecular Biology and Evolution, 22, 193--199.

See Also

bootstrap.pml, pmlPart, pmlMix, plot.phylo

Examples

example(NJ)
# Jukes-Cantor (starting tree from NJ)  
  fitJC <- pml(tree, Laurasiatherian)  
# optimize edge length parameter     
  fitJC <- optim.pml(fitJC)
  fitJC 
  
# search for a better tree using NNI rearrangements     
  fitJC <- optim.pml(fitJC, optNni=TRUE)
  fitJC   
  plot(fitJC$tree)

# JC + Gamma + I - model
  fitJC_GI <- update(fitJC, k=4, inv=.2)
# optimize shape parameter + proportion of invariant sites     
  fitJC_GI <- optim.pml(fitJC_GI, optGamma=TRUE, optInv=TRUE)
# GTR + Gamma + I - model
  fitGTR <- optim.pml(fitJC_GI, optNni=TRUE, optGamma=TRUE, optInv=TRUE, optBf=TRUE, optQ=TRUE)

# 2-state data (RY-coded)  
    
  dat <- as.character(Laurasiatherian)
# RY-coding
  dat[dat=="a"] <- "r"
  dat[dat=="g"] <- "r"
  dat[dat=="c"] <- "y"
  dat[dat=="t"] <- "y"
  dat <- phyDat(dat, type="USER", levels=c("r","y"))
  fit2ST <- pml(tree, dat, k=4, inv=.25) 
  fit2ST <- optim.pml(fit2ST,optNni=TRUE, optGamma=TRUE, optInv=TRUE) 
  fit2ST
# show some of the methods available for class pml
  methods(class="pml")