read.nexus.data reads a file with sequences in the NEXUS
format. nexus2DNAbin is a helper function to convert the output
from the previous function into the class "DNAbin".
For the moment, only sequence data (DNA or protein) are supported.
read.nexus.data(file)
nexus2DNAbin(x)
A list of sequences, each made of a single vector of mode character where each element is a (phylogenetic) character state.
file: a file name specified by either a variable of mode character, or a double-quoted string.
x: an object output by read.nexus.data.
Johan Nylander, Thomas Guillerme, and Klaus Schliep
This parser tries to read data from a file written in a restricted NEXUS format (see examples below).
Please see files data.nex and taxacharacters.nex for
examples of formats that will work.
Some notable exceptions from the NEXUS standard (non-exhaustive list):
I. Comments must be either on separate lines or at the
end of lines. Examples:
[Comment] --- OK
Taxon ACGTACG [Comment] --- OK
[Comment line 1
Comment line 2] --- NOT OK!
Tax[Comment]on ACG[Comment]T --- NOT OK!
II. No spaces (or comments) are allowed in the
sequences. Examples:
name ACGT --- OK
name AC GT --- NOT OK!
III. No spaces are allowed in taxon names, not even if
names are in single quotes. That is, single-quoted names are not
treated as such by the parser. Examples:
Genus_species --- OK
'Genus_species' --- OK
'Genus species' --- NOT OK!
IV. The trailing end that closes the
matrix must be on a separate line. Examples:
taxon AACCGGT
end; --- OK
taxon AACCGGT;
end; --- OK
taxon AACCCGT; end; --- NOT OK!
V. Multistate characters are not allowed. That is,
NEXUS allows you to specify multiple character states at a
character position either as an uncertainty, (XY), or as an
actual appearance of multiple states, {XY}. This
information is not handled by the parser. Examples:
taxon 0011?110 --- OK
taxon 0011{01}110 --- NOT OK!
taxon 0011(01)110 --- NOT OK!
VI. The number of taxa must be on the same line as
ntax. The same applies to nchar. Examples:
ntax = 12 --- OK
ntax =
12 --- NOT OK!
VII. The word “matrix” cannot occur anywhere in
the file before the actual matrix command, unless it is in
a comment. Examples:
BEGIN CHARACTERS;
TITLE 'Data in file "03a-cytochromeB.nex"';
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
["This is The Matrix"] --- OK
MATRIX

BEGIN CHARACTERS;
TITLE 'Matrix in file "03a-cytochromeB.nex"'; --- NOT OK!
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
MATRIX
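Several of the restrictions listed above can be checked before a file is handed to the parser. The following is a rough base-R sketch, not part of ape: check_restricted_nexus is a hypothetical helper, it assumes the matrix command sits alone on its own line, and it only looks at restrictions II to VII in a simplified way (it strips single-line comments and does not try to detect multi-line ones).

```r
# Hypothetical helper (not part of ape): flags a few of the documented
# restrictions in the lines of a NEXUS file. Returns a character vector
# of problem descriptions, empty if none were found.
check_restricted_nexus <- function(lines) {
  problems <- character(0)

  # Drop single-line [comments] before inspecting anything else.
  bare <- gsub("\\[[^\\]]*\\]", "", lines, perl = TRUE)

  # Restriction VI: the count must follow ntax / nchar on the same line.
  for (key in c("ntax", "nchar")) {
    has_key <- grepl(paste0("\\b", key, "\\b"), bare,
                     ignore.case = TRUE, perl = TRUE)
    has_val <- grepl(paste0("\\b", key, "\\s*=\\s*\\d+"), bare,
                     ignore.case = TRUE, perl = TRUE)
    if (any(has_key & !has_val))
      problems <- c(problems, paste(key, "value is not on the same line"))
  }

  # Restriction VII: the first uncommented "matrix" must be the command itself.
  word_hits <- grep("\\bmatrix\\b", bare, ignore.case = TRUE, perl = TRUE)
  cmd_hits  <- grep("^\\s*matrix\\s*$", bare, ignore.case = TRUE, perl = TRUE)
  if (length(cmd_hits) == 0)
    return(c(problems, "no matrix command found on its own line"))
  if (word_hits[1] < cmd_hits[1])
    problems <- c(problems, "the word 'matrix' occurs before the matrix command")

  # Collect matrix rows: everything after the command, up to the first
  # line that starts with ";" or "end".
  after   <- bare[seq_len(length(bare) - cmd_hits[1]) + cmd_hits[1]]
  closing <- grep("^\\s*(;|end\\b)", after, ignore.case = TRUE, perl = TRUE)
  stop_at <- if (length(closing)) closing[1] - 1 else length(after)
  rows    <- trimws(after[seq_len(stop_at)])
  rows    <- rows[nzchar(rows)]

  # Restriction IV: the closing "end;" must not share a line with a sequence.
  trailing_end <- grepl("\\bend\\s*;\\s*$", rows, ignore.case = TRUE, perl = TRUE)
  if (any(trailing_end))
    problems <- c(problems, "closing 'end;' is on the same line as a sequence")

  # Restrictions II and III: exactly one name and one sequence per row.
  n_fields <- lengths(strsplit(rows[!trailing_end], "\\s+"))
  if (any(n_fields > 2))
    problems <- c(problems, "spaces found in a taxon name or sequence")

  # Restriction V: no (XY) or {XY} multistate characters.
  if (any(grepl("[(){}]", rows)))
    problems <- c(problems, "multistate characters found in the matrix")

  problems
}
```

A call such as check_restricted_nexus(readLines("file.nex")) would return the list of problems found, where "file.nex" stands for your own file. An empty result only means that none of these simplified checks fired, not that the parser is guaranteed to accept the file.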
Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an extensible file format for systematic information. Systematic Biology, 46, 590-621.
read.nexus, write.nexus,
write.nexus.data
## Use read.nexus.data to read a file in NEXUS format into object x
if (FALSE) x <- read.nexus.data("file.nex")
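A self-contained variant of the example above, offered as a sketch: it writes a small file to a temporary location and reads it back. The file contents, the taxon names and the variable names (nex_file, x, dna) are made up for illustration, and the layout is an assumption about what the restricted format accepts, based on the rules described in this page. The ape calls are guarded so the snippet does nothing further if ape is not installed.

```r
# Write a tiny NEXUS file that stays within the restrictions described above.
# The contents are illustrative, not taken from the files shipped with ape.
nex_file <- tempfile(fileext = ".nex")
writeLines(c(
  "#NEXUS",
  "begin data;",
  "  dimensions ntax=3 nchar=8;",
  "  format datatype=dna missing=? gap=-;",
  "  matrix",
  "    Taxon_A ACGTACGT",
  "    Taxon_B ACGTAC-T",
  "    Taxon_C ACGT?CGT",
  "  ;",
  "end;"
), nex_file)

if (requireNamespace("ape", quietly = TRUE)) {
  # Read the sequences: a list with one character vector per taxon.
  x <- ape::read.nexus.data(nex_file)
  str(x)

  # Convert the list to an object of class "DNAbin".
  dna <- ape::nexus2DNAbin(x)
  print(dna)
}
```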