A phylo-HMM consisting of two states is assumed: a "conserved" state and a "non-conserved" state. If two phylogenetic models are given, the first is the conserved state, and the second is the non-conserved state. If only one model is given, then this is used as the non-conserved state, and the conserved state is obtained by multiplying the branch lengths by the parameter rho.
phastCons(msa, mod, rho = 0.3, target.coverage = 0.05,
expected.length = 10, transitions = NULL, estimate.rho = FALSE,
estimate.expected.length = FALSE, estimate.transitions = FALSE,
estimate.trees = FALSE, viterbi = TRUE, gc = NULL, nrates = NULL,
ref.idx = 1, quiet = FALSE)
An object of type msa
representing the multiple
alignment to be scored.
Either a single object of type tm
, or a list
containing two tm
objects. If two objects are given, they
represent the conserved and non-conserved phylogenetic models. If
one is given, then this represents the non-conserved model, and the
conserved model is obtained by scaling the branches by a parameter
rho.
Set the scale (overall evolutionary rate) of the model for the conserved state to be <rho> times that of the model for the non-conserved state ( 0 < rho < 1). If used with estimate.trees or estimate.rho, the specified value will be used for initialization only, and rho will be estimated. This argument is ignored if mod contains two tree model objects.
A single numeric value, representing the fraction of sites in conserved elements. This argument sets a prior expectation rather than a posterior and assumes stationarity of the state-transition process. Adding this constraint causes the ratio of between-state transitions to be fixed at (1-gamma)/gamma (where gamma is the target.coverage value).
A single numeric value, representing the parameter omega, which describes the expected length of conserved elements. This is an alternative to the transitions argument. If provided with target.coverage, than transition rates are fully determined, otherwise the target-coverage parameter will be estimated by maximum likelihood.
(Alternative to target.coverage and expected.length; ignored if
either of these are specified). A
numeric vector of length one or two, representing the
transition probabilities for the two-state HMM. The first value represents mu, the
transition rate from the conserved to the non-conserved state, and the second
value is nu, the rate from non-conserved to conserved. If only one value is
provided then mu=nu. The rate of self-transitions are then 1-mu and 1-nu, and
the expected lengths of conserved and non-conserved elements are 1/mu and 1/nu,
respectively. If estimate.transition is TRUE
, the provided values will be
used for initialization.
A logical value. If TRUE
, Estimate the parameter
rho (as described above), using maximum likelihood. Estimated value is reported
in return list. This use is discouraged (see note below).
A logical value. If TRUE
, estimate the
expected length of conserved elements by maximum likelihood, and use the
target.coverage parameter for initialization. Setting this parameter to TRUE
is discouraged (see note below).
A logical value. If TRUE
, estimate the transition
rates between conserved and non-conserved states by maximum likelihood. The parameter
transitions is then used for initialization. This argument is ignored if
estimate.expected.length==TRUE. Setting
this argument to TRUE
is discouraged (see note below).
A logical value. If TRUE
, estimate free
parameters of tree models for conserved and non-conserved state. Setting this
argument to TRUE
is discouraged (see note below).
A logical value. If TRUE
, produce discrete elements
using the Viterbi algorithm.
A single numeric value given the fraction of bases that are G or C, for optional use with estimate.trees or estimate.rho. This overrides the default behavior of estimating the base composition empirically from the data.
An integer vector of length one or two, for optional use with estimate.trees and a discrete-gamma model. Assume the specified number of rate categories, rather than the number given in the input tree model(s). If two values are given they apply to the conserved and nonconserved models, respectively.
An integer value. Use the coordinate frame of the given sequence. Default is 1, indicating the first sequence in the alignment. A value of 0 indicates the coordinate frame of the entire alignment.
If TRUE
, suppress printing of progress information.
A list containing parameter estimates. The list may have any of the following elements, depending on the arguments:
A numeric vector of length two giving the rates from the conserved to the non-conserved state, and from the non-conserved to the conserved state.
The relative evolutionary rate of the conserved state compared to the non-conserved state.
Tree model objects describing the evolutionary process in the conserved and non-conserved states.
An object of type feat
which describes conserved elements
detected by the Viterbi algorithm.
A data frame giving a coordinate and score for individual bases in the alignment
The likelihood of the data under the estimated model.
# NOT RUN {
exampleArchive <- system.file("extdata", "examples.zip", package="rphast")
files <- c("ENr334-100k.fa", "rev.mod")
unzip(exampleArchive, files)
mod <- read.tm("rev.mod")
msa <- read.msa("ENr334-100k.fa")
rv <- phastCons(msa, mod)
names(rv)
rv2 <- phastCons(msa, mod, estimate.trees=TRUE)
names(rv2)
rv2$tree.models
unlink(files)
# }
Run the code above in your browser using DataLab