ape (version 4.0)

DNAbin: Manipulate DNA Sequences in Bit-Level Format

Description

These functions help to manipulate DNA sequences coded in the bit-level coding scheme.

Usage

"print"(x, printlen = 6, digits = 3, ...)
"rbind"(...)
"cbind"(..., check.names = TRUE, fill.with.gaps = FALSE, quiet = FALSE)
"["(x, i, j, drop = FALSE)
"as.matrix"(x, ...)
"c"(..., recursive = FALSE)
"as.list"(x, ...)
"labels"(object, ...)

Arguments

x, object
an object of class "DNAbin".
...
either further arguments to be passed to or from other methods in the case of print, as.matrix, and labels, or a series of objects of class "DNAbin" in the case of rbind, cbind, and c.
printlen
the number of labels to print (6 by default).
digits
the number of digits to print (3 by default).
check.names
a logical specifying whether to check the rownames before binding the columns (see details).
fill.with.gaps
a logical indicating whether to keep all possible individuals, as indicated by the rownames, filling in any missing data with insertion gaps (ignored if check.names = FALSE).
quiet
a logical to switch off warning messages when some rows are dropped.
i, j
indices of the rows and/or columns to select or to drop. They may be numeric, logical, or character (in the same way as for standard R objects).
drop
logical; if TRUE, the returned object is of the lowest possible dimension.
recursive
for compatibility with the generic (unused).

Value

an object of class "DNAbin" in the case of rbind, cbind, and [.

Details

These are all methods of generic functions, applied here to DNA sequences stored as objects of class "DNAbin". They are used in the same way as the standard R functions for manipulating vectors, matrices, and lists. Additionally, the operators [[ and $ may be used to extract a vector from a list. Note that the default of drop differs from that of the generic operator: this avoids dropping rownames when selecting a single sequence.
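As an illustration of the indexing behaviour described above, here is a minimal sketch. It assumes the ape package is installed and uses the woodmouse data set that ships with it; the object names one, block, wl, and first are chosen only for this example.

```r
library(ape)      # assumes the ape package is installed
data(woodmouse)   # example alignment shipped with ape

# Selecting a single row keeps a one-row matrix (drop = FALSE by default),
# so the sequence label is preserved.
one <- woodmouse[1, ]
dim(one)
rownames(one)

# Row and column indices combine as for an ordinary matrix.
block <- woodmouse[1:3, 1:10]
dim(block)

# Once stored as a list, a single sequence can be extracted with [[ (or $).
wl <- as.list(block)
first <- wl[[1]]
length(first)
```

Passing drop = TRUE explicitly should restore the usual dimension-dropping behaviour, at the cost of losing the row name for a single sequence.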

These functions are provided to make it easy to manipulate DNA sequences coded with the bit-level coding scheme. This scheme allows much faster comparisons of sequences and stores them in less memory than the format used before ape 1.10.

For cbind, the default behaviour is to keep only the individuals (as indicated by the rownames) for which there are no missing data. If fill.with.gaps = TRUE, a 'complete' matrix is returned, possibly with insertion gaps as missing data. If check.names = TRUE (the default), the rownames of each matrix are checked, and the rows are reordered if necessary. If check.names = FALSE, the matrices must all have the same number of rows and are simply bound together; the rownames of the first matrix are used. See the examples.
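The effect of check.names can be sketched as follows, assuming ape is installed and using the bundled woodmouse data. The objects x, y, z, and z2 are illustrative names. Rows are looked up by name here because the row order of the result is not something this sketch relies on.

```r
library(ape)      # assumes the ape package is installed
data(woodmouse)

x <- woodmouse[1:3, 1:5]
y <- woodmouse[3:1, 6:10]   # same three individuals, rows in reverse order

# check.names = TRUE (the default): rows are matched by rowname
# before the columns are bound.
z <- cbind(x, y)
matched <- all(as.character(z[rownames(x), ]) ==
               as.character(woodmouse[rownames(x), 1:10]))
matched

# check.names = FALSE: rows are bound by position and the rownames of
# the first matrix are used, so the two halves no longer correspond.
z2 <- cbind(x, y, check.names = FALSE)
rownames(z2)
```

With check.names = FALSE it is therefore up to the caller to make sure the rows of all matrices are already in the same order.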

as.matrix may be used to convert DNA sequences (of the same length) stored in a list into a matrix while keeping the names and the class. as.list does the reverse operation.
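A short round trip between the two storage modes might look like the sketch below. It assumes ape is installed and uses the woodmouse data; sub, wl, and wm are illustrative names.

```r
library(ape)      # assumes the ape package is installed
data(woodmouse)

sub <- woodmouse[1:3, 1:10]   # a small DNAbin matrix

# Matrix -> list: one element per sequence, named after the rownames.
wl <- as.list(sub)
names(wl)

# List -> matrix: possible because all sequences have the same length.
wm <- as.matrix(wl)
dim(wm)
class(wm)

# The nucleotides are unchanged by the round trip.
same <- all(as.character(wm) == as.character(sub))
same
```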

References

Paradis, E. (2007) A Bit-Level Coding Scheme for Nucleotides. http://ape-package.ird.fr/misc/BitLevelCodingScheme_20April2007.pdf

Paradis, E. (2012) Analysis of Phylogenetics and Evolution with R (Second Edition). New York: Springer.

See Also

as.DNAbin, read.dna, read.GenBank, write.dna, image.DNAbin, AAbin

The corresponding generic functions are documented in the package base.

Examples

library(ape)
data(woodmouse)
woodmouse
print(woodmouse, 15, 6)
print(woodmouse[1:5, 1:300], 15, 6)
### Just to show how distances could be influenced by sampling:
dist.dna(woodmouse[1:2, ])
dist.dna(woodmouse[1:3, ])
### cbind and its options:
x <- woodmouse[1:2, 1:5]
y <- woodmouse[2:4, 6:10]
as.character(cbind(x, y)) # gives warning
as.character(cbind(x, y, fill.with.gaps = TRUE))
## Not run:
# as.character(cbind(x, y, check.names = FALSE)) # gives an error
## End(Not run)