add.ls.mod: Add a lineage-specific model

Description

Lineage-specific models allow a different substitution model to be defined on a specified set of branches. An entirely different substitution model can be used, as long as it is of the same order and has the same number of states as the model used in the rest of the tree. Alternatively, if the same substitution model is used, certain parameters can be optimized separately from the main model while the others are shared with it.

Usage

add.ls.mod(x, branch = NULL, label = NULL, category = 0,
  subst.mod = NULL, separate.params = NULL, const.params = NULL,
  backgd = NULL, selection = NULL, bgc = NULL)

Arguments

x

An object of type tm

branch

If the lineage-specific model applies to a single branch, it can be named here using the name of the node which descends from the branch. See name.ancestors for naming internal nodes.

label

(Alternative to branch). The label which identifies the branch(es) which this lineage-specific model should apply to. Labels are denoted in a tree with a pound sign and label following the node. See label.branches and label.subtree to add a label to a tree.

category

An integer indicating the rate category or categories to which the lineage-specific model applies. This only works if x$nratecats > 1. A value of 0 or NULL implies all categories. Otherwise this can be an integer (or vector of integers) from 1..x$nratecats.

subst.mod

A character string indicating the substitution model to be used for the lineage-specific model. If NULL, use the same model as the rest of the tree. See subst.mods for a list of possible substitution models.

separate.params

(Only applies if subst.mod is the same as the main model.) A vector of character strings indicating which parameters to estimate separately. Possible values are "kappa", "sel", "bgc", "gap_param", "backgd", and "ratematrix". If backgd, selection, or bgc are provided as arguments, they are automatically considered separate parameters and do not need to be explicitly listed here. "ratematrix" implies all parameters describing the substitution model (but does not include backgd, sel, or bgc). Boundaries can optionally be appended to parameter names in brackets, i.e., "kappa[1,10]" sets boundaries for kappa between 1 and 10 (see the "Parameter boundaries" section of phyloFit). If subst.mod is different from the main model, then no parameters are shared with the main model. However, the equilibrium frequencies can still be shared by setting backgd to NULL.
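The bracket syntax for boundaries can be assembled programmatically. The sketch below uses only base R; the helper name with.bounds is hypothetical and introduced here for illustration, it is not part of rphast.

```r
# Hypothetical helper (not part of rphast): append "[lower,upper]" to a
# parameter name, following the bracket syntax described above.
with.bounds <- function(param, lower, upper) {
  stopifnot(lower <= upper)
  sprintf("%s[%s,%s]", param, format(lower), format(upper))
}

# Estimate kappa separately with boundaries 1 and 10, and backgd separately
# without boundaries.
separate.params <- c(with.bounds("kappa", 1, 10), "backgd")
print(separate.params)
```

The resulting character vector is what would be passed as the separate.params argument.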

const.params

A character vector indicating which parameters to hold constant at their initial values, rather than being optimized upon a call to phyloFit. Possible values are the same as for separate.params, although no boundaries can be given here.

backgd

The initial equilibrium frequencies to use for this model. If NULL, use the same as in the main model.

selection

The selection parameter (from the sel+bgc model), relative to selection in the main model.

bgc

The bgc parameter (from the sel+bgc model).

Value

An object of type tm, identical to the input model but with a new lineage-specific model added on. This lineage-specific model is not validated by this function.

Details

A lineage-specific model is stored as a list with the following elements: defn, rate.matrix, and optional elements backgd, selection, bgc.

defn is a character string which defines the model in a way that phast can parse; it is a colon-delimited string with 2 or 3 elements. The first element indicates which branches the model applies to. The second indicates which substitution model to use or, if the same substitution model is used, which parameters to optimize separately (and may also impose boundaries on these parameters). The optional third element is a list of parameters which will not be optimized by phyloFit.
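As a rough illustration of this layout, the base R sketch below splits a defn-like string into its fields. The example string is an assumption constructed from the description above; the exact strings that rphast generates may differ.

```r
# Illustrative defn string (an assumption, not taken from rphast output):
# branch "hc", kappa optimized separately within [1,10], backgd held constant.
defn <- "hc:kappa[1,10]:backgd"

# Split on colons; 2 or 3 fields are expected.
fields <- strsplit(defn, ":", fixed = TRUE)[[1]]
stopifnot(length(fields) %in% c(2, 3))

parsed <- list(
  branches        = fields[1],  # which branches the model applies to
  model.or.params = fields[2],  # substitution model, or separate parameters
  const.params    = if (length(fields) == 3) fields[3] else NULL
)
str(parsed)
```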

backgd is the initial set of equilibrium frequencies for this model; if not present, then the equilibrium frequencies will be shared with the main model.

selection and bgc are optional parameters for the model with biased gene conversion and selection. If they are not provided, this model is not used. Note that selection is defined relative to selection in the main model if x$selection is not NULL, so the total selection in the lineage-specific model is the sum of the selection values in the main and lineage-specific models.
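A small worked example of the relative selection parameter, using made-up values and base R only:

```r
# Illustrative values only; neither number comes from a fitted model.
main.selection <- 2.0    # x$selection in the main model
ls.selection   <- -0.5   # selection value of the lineage-specific model

# Total selection acting on the lineage-specific branches is the sum.
total.selection <- main.selection + ls.selection
print(total.selection)   # 1.5
```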

A tree model can have multiple lineage-specific models; if a later model applies to the same branch as an earlier model, then the later one overrides it.

All lineage-specific models are stored in the ls.model element of the tm object.
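The following is a minimal, untested sketch of typical use, assuming the rphast package is installed. The tree, node names, and parameter values are illustrative assumptions; tm is used here with only its tree, subst.mod, and backgd arguments.

```r
# Run only if rphast is available; all values below are illustrative.
if (requireNamespace("rphast", quietly = TRUE)) {
  library(rphast)

  # Main model: HKY85 on a four-species tree whose internal nodes are
  # named "hc" and "mr".
  m <- tm(tree = "((human:0.01,chimp:0.01)hc:0.05,(mouse:0.1,rat:0.1)mr:0.05)",
          subst.mod = "HKY85", backgd = rep(0.25, 4))

  # Estimate kappa separately, bounded between 1 and 10, on the branch
  # leading to node "hc"; all other parameters are shared with the main model.
  m <- add.ls.mod(m, branch = "hc", separate.params = "kappa[1,10]")

  # The new model is stored in the ls.model element of the tm object.
  str(m$ls.model)
}
```

The returned object can then be passed to phyloFit, which optimizes the separate parameters; as noted under Value, add.ls.mod itself does not validate the lineage-specific model.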