ape (version 4.0)

base.freq: Base frequencies from DNA Sequences

Description

base.freq computes the frequencies (absolute or relative) of the four DNA bases (adenine, cytosine, guanine, and thymine) from a sample of sequences.

GC.content computes the proportion of G+C (using the previous function). All missing or unknown sites are ignored.

Ftab computes the contingency table with the absolute frequencies of the DNA bases from a pair of sequences.

Usage

base.freq(x, freq = FALSE, all = FALSE)
GC.content(x)
Ftab(x, y = NULL)

Arguments

x
a vector, a matrix, or a list which contains the DNA sequences.
y
a vector with a single DNA sequence.
freq
a logical specifying whether to return the absolute frequencies (counts) instead of the proportions. The default, FALSE, returns proportions.
all
a logical; by default only the counts of A, C, G, and T are returned. If all = TRUE, all counts of bases, ambiguous codes, missing data, and alignment gaps are returned.

Value

A numeric vector with names c("a", "c", "g", "t") (and possibly "r", "m", ...) for base.freq, a single numeric value for GC.content, or a four-by-four matrix with similar dimnames for Ftab.

Details

The base frequencies are computed over all sequences in the sample.
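
To make the pooling concrete, here is a minimal base-R sketch of the same idea. It is not ape's implementation: the toy sequences are hypothetical, they are plain lowercase character vectors rather than DNAbin objects, and the sketch assumes that symbols other than a, c, g, and t are dropped before the proportions are normalised.

```r
# Hypothetical toy data: two short sequences as lowercase character vectors
# (not DNAbin objects, which is what base.freq itself expects).
seqs <- list(
  s1 = c("a", "c", "g", "t", "a", "n"),
  s2 = c("g", "g", "c", "-", "t", "a")
)

bases <- c("a", "c", "g", "t")

# Pool every site from every sequence into one vector.
pooled <- unlist(seqs, use.names = FALSE)

# Count each of the four bases; "n" and "-" match none of them and are ignored.
counts <- vapply(bases, function(b) sum(pooled == b), integer(1))

# Relative frequencies over the retained sites only.
props <- counts / sum(counts)

# G+C proportion derived from the same frequencies.
gc <- unname(props["g"] + props["c"])
```

Here `counts` plays the role of the freq = TRUE result and `props` the default one; `gc` is obtained from `props`, in the same way that GC.content is described as relying on base.freq.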

For Ftab, if the argument y is given, then both x and y are coerced to vectors and must be of equal length. If y is not given, x must be a matrix or a list, and only the first two sequences are used.
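
The pairwise table can be sketched in base R as a cross-tabulation of two aligned sequences. This is an illustration only, not ape's code: the vectors `x` and `y` are hypothetical lowercase character vectors, and the sketch assumes that a site is counted only when both sequences carry one of a, c, g, or t.

```r
# Hypothetical aligned sequences of equal length (plain character vectors).
x <- c("a", "c", "g", "t", "a", "n", "g")
y <- c("a", "t", "g", "t", "c", "a", "-")

# Both sequences must have the same number of sites.
stopifnot(length(x) == length(y))

bases <- c("a", "c", "g", "t")

# Restricting the factor levels turns any other symbol into NA;
# table() drops pairs containing NA by default.
ftab <- table(
  factor(x, levels = bases),
  factor(y, levels = bases),
  dnn = c("x", "y")
)

# Sites where both sequences carry the same base lie on the diagonal.
n_identical <- sum(diag(ftab))
```

Rows index the base in the first sequence and columns the base in the second, so off-diagonal cells count the sites where the two sequences differ.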

See Also

seg.sites, nuc.div, DNAbin

Examples

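data(woodmouse)
base.freq(woodmouse)                           # proportions of a, c, g, t
base.freq(woodmouse, freq = TRUE)              # absolute counts
base.freq(woodmouse, freq = TRUE, all = TRUE)  # counts incl. ambiguity codes, missing data, gaps
GC.content(woodmouse)
Ftab(woodmouse)                                # uses the first two sequences
Ftab(woodmouse[1, ], woodmouse[2, ])           # same as above
Ftab(woodmouse[14:15, ])                       # between the last two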
