ape (version 5.6-2)

base.freq: Base frequencies from DNA Sequences

Description

base.freq computes the frequencies (absolute or relative) of the four DNA bases (adenine, cytosine, guanine, and thymine) from a sample of sequences.

GC.content computes the proportion of G+C (using the previous function). All missing or unknown sites are ignored.

Ftab computes the contingency table with the absolute frequencies of the DNA bases from a pair of sequences.

Usage

base.freq(x, freq = FALSE, all = FALSE)
GC.content(x)
Ftab(x, y = NULL)

Value

base.freq returns a numeric vector with names c("a", "c", "g", "t") (and possibly "r", "m", ...); GC.content returns a single numeric value; Ftab returns a four-by-four matrix with similar dimnames.

Arguments

x

a vector, a matrix, or a list which contains the DNA sequences.

y

a vector with a single DNA sequence.

freq

a logical specifying whether to return the proportions (freq = FALSE, the default) or the absolute frequencies, i.e. counts (freq = TRUE).

all

a logical; by default only the counts of A, C, G, and T are returned. If all = TRUE, all counts of bases, ambiguous codes, missing data, and alignment gaps are returned.

Author

Emmanuel Paradis

Details

The base frequencies are computed over all sequences in the sample.
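To make "computed over all sequences" concrete, here is a minimal base R sketch of the pooling idea. It is not the ape implementation: it works on plain character vectors rather than DNAbin objects, and the toy sequences and variable names (seqs, pooled, counts, props, gc) are made up for illustration. Sites other than a, c, g, t are dropped here, mirroring the default behaviour described above.

```r
# Hypothetical toy sample: three sequences stored as lowercase character
# vectors (ape itself works on DNAbin objects, not on this representation).
seqs <- list(
  s1 = c("a", "c", "g", "t", "a"),
  s2 = c("g", "g", "c", "n", "a"),
  s3 = c("t", "-", "c", "c", "a")
)

# Pool the sites of all sequences into a single vector.
pooled <- unlist(seqs, use.names = FALSE)

# Absolute frequencies of the four unambiguous bases; "n" and "-" match
# none of them and are therefore not counted.
bases <- c("a", "c", "g", "t")
counts <- vapply(bases, function(b) sum(pooled == b), integer(1))

# Relative frequencies: counts divided by the number of counted sites.
props <- counts / sum(counts)

# G+C proportion derived from the relative frequencies.
gc <- sum(props[c("g", "c")])

print(counts)
print(props)
print(gc)
```

With real data, base.freq(x, freq = TRUE) and GC.content(x) give the corresponding quantities directly.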

For Ftab, if the argument y is given, then both x and y are coerced to vectors and must be of equal length. If y is not given, x must be a matrix or a list, and only the first two sequences are used.
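The pairwise contingency table can be pictured with a short base R sketch. This is an illustration under stated assumptions, not how Ftab is implemented: the two sequences x and y are hypothetical character vectors, and sites where either sequence holds something other than a, c, g, t are simply dropped, which is a choice made for this sketch.

```r
# Two hypothetical aligned sequences of equal length.
x <- c("a", "c", "g", "t", "a", "c", "n")
y <- c("a", "c", "a", "t", "g", "c", "t")
stopifnot(length(x) == length(y))

# Restricting the factor levels to the four bases turns any other symbol
# into NA, and table() leaves NA pairs out of the counts by default.
bases <- c("a", "c", "g", "t")
ftab <- table(factor(x, levels = bases), factor(y, levels = bases))

# Rows index the base in x, columns the base in y.
print(ftab)

# Sites where the two sequences agree lie on the diagonal.
n_same <- sum(diag(ftab))
n_diff <- sum(ftab) - n_same
```

The off-diagonal cells count the sites where the two sequences differ, broken down by the kind of substitution.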

See Also

seg.sites, nuc.div (in pegas), DNAbin

Examples

data(woodmouse)
base.freq(woodmouse)
base.freq(woodmouse, TRUE)
base.freq(woodmouse, TRUE, TRUE)
GC.content(woodmouse)
Ftab(woodmouse)
Ftab(woodmouse[1, ], woodmouse[2, ]) # same as above
Ftab(woodmouse[14:15, ]) # between the last two