phyloP: phyloP (basewise or by feature)

Description

Computes conservation or acceleration p-values for an alignment under a neutral evolutionary model. Produces scores for every column in the alignment, or for every element in a set of features.

Usage

phyloP(mod, msa, method = "LRT", mode = "CON", features = NULL,
  subtree = NULL, branches = NULL, ref.idx = 1, outfile = NULL,
  outfile.only = FALSE, outfile.format = "default")

Arguments

mod

An object of class tm representing the neutral model.

msa

The multiple alignment to be scored.

method

The scoring method. One of "SPH", "LRT", "SCORE", or "GERP".

mode

The type of p-value to compute. One of "CON", "ACC", "NNEUT", or "CONACC".

features

An object of type feat. If given, compute p-values for every feature.

subtree

A character string giving the name of a node in the tree. Partition the tree into the subtree beneath the node and the complementary supertree, and consider conservation or acceleration in the subtree given the supertree. The branch above the specified node is included with the subtree.

branches

A vector of character strings giving the names of branches to consider in the subtree. The remaining branches are considered part of the supertree, and the test considers conservation or acceleration in the subtree relative to the supertree. This option is currently only available for method="LRT" or "SCORE".

ref.idx

Index of the reference sequence in the alignment. If zero, use the frame of reference of the entire alignment. If ref.idx==-1 and features are provided, try to guess the frame of reference of each individual feature based on its sequence name.

outfile

Character string. If given, write results to given file.

outfile.only

Logical. If TRUE, do not return any results to R (this may be useful if results are very large).

outfile.format

Character string describing format of file output. Possible formats depend on other options (see description below). Current options are "default", "gff", or "wig".

Value

A data frame containing scores and parameter estimates for every feature (if features is given) or for every base (otherwise).

Details

outfile.format options:

If features is provided, then outfile.format can be either "default" or "gff". If it is "default", then the outfile will be a table in zero-based coordinates, which includes start and end coordinates, feature name, parameter estimates, and p-values. If outfile.format is "gff", then the output file will be a GFF file (in 1-based coordinates) with a score equal to the -log10 p-value for each element.
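
Because the GFF score is on a -log10 scale, a larger score corresponds to a smaller p-value. The conversion can be sketched in base R as follows; the p-values here are made up for illustration and are not phyloP output.

```r
# Hypothetical p-values, for illustration only (not real phyloP output).
p <- c(0.5, 0.01, 1e-6)

# Score on the scale described for the GFF score column: -log10 of the p-value.
score <- -log10(p)

# Recover the p-value from a score.
p.back <- 10^(-score)
```

A score of 2 therefore corresponds to p = 0.01, and a score of 6 to p = 1e-6.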

If features is not provided, then outfile.format can be either "default" or "wig". In either case the outfile will be in fixed step wig format (see http://genome.ucsc.edu/goldenPath/help/wiggle.html). If outfile.format is "default", then each row (corresponding to one alignment column) will contain several values, such as parameter estimates and p-values for that column. If outfile.format is "wig", then the output file will be in strict wig format, with a single value per line indicating the -log10 p-value.
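
A strict fixed-step wig file can be read back into R with a few lines of base R. The sketch below is a minimal illustration, not part of rphast: read.fixed.wig is a hypothetical helper, the chromosome name, start position, and values are made up, and the parser assumes a single fixedStep block whose declaration line includes chrom, start, and step.

```r
# Toy fixed-step wig content; chromosome name, start, and values are made up.
wig.lines <- c(
  "fixedStep chrom=chr1 start=101 step=1",
  "0.30",
  "2.00",
  "6.00"
)

# Hypothetical helper (not part of rphast): parse a single-block
# fixedStep wig into a data frame with one row per position.
read.fixed.wig <- function(lines) {
  header <- lines[1]
  stopifnot(grepl("^fixedStep", header))
  # Split the declaration line into "key=value" fields.
  fields <- strsplit(sub("^fixedStep[[:space:]]+", "", header), "[[:space:]]+")[[1]]
  kv <- strsplit(fields, "=")
  vals <- setNames(vapply(kv, `[`, "", 2), vapply(kv, `[`, "", 1))
  start <- as.integer(vals[["start"]])
  step <- as.integer(vals[["step"]])
  score <- as.numeric(lines[-1])
  data.frame(chrom = vals[["chrom"]],
             pos = start + step * (seq_along(score) - 1L),
             score = score,
             stringsAsFactors = FALSE)
}

wig <- read.fixed.wig(wig.lines)
# Convert the -log10 scores back to p-values.
wig$pvalue <- 10^(-wig$score)
```

For a real output file, the same helper could be applied to readLines() on the path given as outfile, provided the file contains only one fixedStep block.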

Examples

# NOT RUN {
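# Unpack the example alignment, annotations, and neutral model shipped with rphast
exampleArchive <- system.file("extdata", "examples.zip", package="rphast")
files <- c("ENr334-100k.fa", "gencode.ENr334-100k.gff", "rev.mod")
unzip(exampleArchive, files)
# Read the neutral model and name the internal nodes of its tree
tm <- read.tm("rev.mod")
tm$tree <- name.ancestors(tm$tree)
msa <- read.msa("ENr334-100k.fa", offset=41405894)
# Basewise scores: write strict wig output to file only, return nothing to R
phyloP(tm, msa, method="LRT", outfile="test.out", outfile.only=TRUE, outfile.format="wig")
# Basewise scores: return a data frame and also write the file (default, then wig format)
t1 <- phyloP(tm, msa, method="LRT", outfile="test.out")
t2 <- phyloP(tm, msa, method="LRT", outfile="test.out", outfile.format="wig")
# Basewise scores with the SPH method, returned to R only
t1 <- phyloP(tm, msa, method="SPH")
# Feature-wise scores for the features read from the GFF file
f <- read.feat("gencode.ENr334-100k.gff")
t1 <- phyloP(tm, msa, method="LRT", outfile="test.out", features=f)
t2 <- phyloP(tm, msa, method="LRT", features=f,
             outfile="test.out", outfile.format="gff", outfile.only=TRUE)
# Clean up the output file and the unpacked example files
unlink("test.out")
unlink(files)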
# }
