SPAGeDi offers a lot of flexibility in how data files are formatted, and read.SPAGeDi accommodates most of that flexibility. The primary exception is that alleles must be delimited in the same way across all genotypes, as specified by allelesep. Comment lines beginning with //, as well as blank lines, are ignored by read.SPAGeDi just as they are by SPAGeDi.
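
The line filtering described above can be sketched in base R. This is only an illustration: strip_ignored_lines is a hypothetical helper, not a polysat function, and the example lines are made up.

```r
# Hypothetical helper (not part of polysat): drop the lines that would be
# skipped, i.e. comment lines beginning with "//" and blank lines.
strip_ignored_lines <- function(lines) {
  is_comment <- startsWith(lines, "//")
  is_blank <- !nzchar(trimws(lines))
  lines[!(is_comment | is_blank)]
}

# Made-up input: one comment, two blank lines, two data lines.
example_lines <- c("// genotypes scored in 2009",
                   "",
                   "ind1\t102104\t301",
                   "   ",
                   "ind2\t102\t301303")
kept <- strip_ignored_lines(example_lines)
```

Only the two data lines remain in kept.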

read.SPAGeDi is not designed to read dominant data (see section 3.2.2 of the SPAGeDi 1.3 manual). However, see genbinary.to.genambig for a way to read this type of data after some simple manipulation in a spreadsheet program.

The first line of a SPAGeDi file contains information that is used by read.SPAGeDi. The ploidy specified in the 6th position of the first line is ignored; ploidy is instead calculated by counting alleles for each individual (including zeros on the right, but not the left, side of the genotype). The number of digits specified in the 5th position of the first line is only used if allelesep="". All other values in the first line are important for the function.
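
As a rough illustration of how the 5th and 6th positions of the first line might be picked out, the base R sketch below splits a header on whitespace. The header string and its values are invented for the example, and whitespace delimiting is an assumption here. This is not polysat's own parsing code.

```r
# Made-up first line with six values; whitespace-delimited is an assumption.
header <- "5\t0\t2\t3\t3\t4"
fields <- as.integer(strsplit(header, "[[:space:]]+")[[1]])

# 5th position: digits per allele (only relevant when allelesep = "").
digits_per_allele <- fields[5]

# 6th position: declared ploidy, which is ignored in favor of counting
# alleles in each genotype.
declared_ploidy <- fields[6]
```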

If the only alleles found for a particular individual and locus are zeros, the genotype is interpreted as missing. Otherwise, zeros on the left side of a genotype are ignored, and zeros on the right side of a genotype are used in calculating the ploidy but are not included in the genotype object that is returned. If allelesep="", read.SPAGeDi checks that the number of characters in the genotype can be evenly divided by the number of digits per allele. If not, zeros are added to the left of the genotype string before splitting it into alleles.
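
The genotype rules above can be sketched in base R as follows. parse_genotype is a hypothetical helper written for illustration and is not the code used inside read.SPAGeDi. Two details are assumptions rather than documented behavior: a missing genotype is given a ploidy of NA, and zeros between two non-zero alleles are treated like zeros on the right. The sketch also assumes a non-empty genotype string.

```r
# Hypothetical helper illustrating the rules above; not polysat's actual code.
parse_genotype <- function(genotype, digits, allelesep = "") {
  if (allelesep == "") {
    # Pad with zeros on the left until the length is a multiple of `digits`.
    remainder <- nchar(genotype) %% digits
    if (remainder != 0) {
      genotype <- paste0(strrep("0", digits - remainder), genotype)
    }
    starts <- seq(1, nchar(genotype), by = digits)
    alleles <- substring(genotype, starts, starts + digits - 1)
  } else {
    alleles <- strsplit(genotype, allelesep, fixed = TRUE)[[1]]
  }
  alleles <- as.integer(alleles)

  # Only zeros: the genotype is missing (NA ploidy is an assumption here).
  if (all(alleles == 0)) {
    return(list(alleles = NA_integer_, ploidy = NA_integer_))
  }

  # Ignore zeros on the left; zeros further right still count toward ploidy
  # but are left out of the returned alleles.
  first_nonzero <- which(alleles != 0)[1]
  alleles <- alleles[first_nonzero:length(alleles)]
  list(alleles = alleles[alleles != 0], ploidy = length(alleles))
}

right_padded <- parse_genotype("102104000", digits = 3)  # zero on the right
short_string <- parse_genotype("2104", digits = 3)       # needs left padding
left_padded  <- parse_genotype("000102", digits = 3)     # zero on the left
all_zero     <- parse_genotype("000000", digits = 3)     # missing genotype
delimited    <- parse_genotype("102/104/0", digits = 3, allelesep = "/")
```

Under these assumptions, right_padded has alleles 102 and 104 with a ploidy of 3, while left_padded has the single allele 102 with a ploidy of 1.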

The Ploidies slot of the "genambig" object that is created is initially indexed by both sample and locus, with ploidy being written to the slot on a per-genotype basis. After all genotypes have been imported, reformatPloidies is used to convert Ploidies to the simplest possible format before the object is returned.
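
One way to picture "the simplest possible format" is the base R sketch below, which collapses a sample-by-locus ploidy matrix. simplify_ploidies is a hypothetical stand-in, and the order of checks (single value, then per sample, then per locus) is an assumption about what simplest means; the real reformatPloidies may behave differently, for example in how it treats missing values. The sketch assumes a matrix with no NA entries.

```r
# Hypothetical stand-in for the idea behind reformatPloidies; the order of
# checks is an assumption. `pld` is a sample-by-locus matrix without NAs.
simplify_ploidies <- function(pld) {
  values <- unique(as.vector(pld))
  if (length(values) == 1) {
    return(values)                 # one ploidy for the whole dataset
  }
  same_within_sample <- all(apply(pld, 1, function(x) length(unique(x)) == 1))
  if (same_within_sample) {
    return(pld[, 1])               # one ploidy per sample
  }
  same_within_locus <- all(apply(pld, 2, function(x) length(unique(x)) == 1))
  if (same_within_locus) {
    return(pld[1, ])               # one ploidy per locus
  }
  pld                              # keep the full matrix
}

# Made-up ploidies: ind1 is tetraploid and ind2 diploid at both loci.
pld <- matrix(c(4L, 2L, 4L, 2L), nrow = 2,
              dimnames = list(c("ind1", "ind2"), c("locA", "locB")))
by_sample <- simplify_ploidies(pld)

# Every genotype has the same ploidy, so a single value is enough.
uniform <- simplify_ploidies(matrix(4L, nrow = 2, ncol = 2))
```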