Read an Applied Biosystems GeneMapper (ABI) output file and prepare it
for analysis. Note that this operates on the summarised output (a text
file), rather than the .fsa files containing data for individual runs.
read.abi(file)
file: The name of the file from which the data are to be read.
The ABI file format contains a few features that make it difficult to
interact with directly, so read.abi provides a wrapper around read.table
to work around these. The three issues are (1) trailing tab characters,
(2) mixed case and punctuation in column names, and (3) parsing the
“Dye/Sample Peak” column.
Because each line of an ABI file contains a trailing tab character (\t),
read.table fails to read the file correctly.
read.abi renames all columns so that non-alphanumeric characters all
become periods, and all uppercase letters are converted to lower case.
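The first two issues can be illustrated with a minimal sketch. This is not
the code used by read.abi; it assumes a small, made-up tab-separated export
(the sample values and the object names lines, cleaned and d are
hypothetical) and shows one way to strip the trailing tab before calling
read.table, then normalise the column names as described above.

```r
# Hypothetical miniature of an ABI-style export: tab-separated, with a
# trailing tab on every line (header included). Values are made up.
lines <- c(
  "Sample File Name\tDye/Sample Peak\tSize\tHeight\t",
  "run1.fsa\tB,1\t101.5\t250\t",
  "run1.fsa\tB,2\t150.2\t480\t"
)

# Remove the trailing tab from each line before parsing.
cleaned <- sub("\t$", "", lines)

# check.names = FALSE keeps the original headers so they can be
# normalised explicitly in the next step.
d <- read.table(text = cleaned, sep = "\t", header = TRUE,
                check.names = FALSE, stringsAsFactors = FALSE)

# Non-alphanumeric characters become periods; letters become lower case.
names(d) <- tolower(gsub("[^[:alnum:]]", ".", names(d)))
```

With real data, lines would come from readLines(file) rather than a
literal character vector.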
The column Dye/Sample Peak contains data of the form
<Dye>,<Sample Peak>, where <Dye> is a code for the dye colour used and
<Sample Peak> is an integer indicating the order of the peaks. Entries
where the contents of Dye/Sample Peak end in a "*" character
(indicating an internal size standard) are automatically excluded from
the analysis.
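The handling of this column can be sketched as follows. This is an
illustration under stated assumptions, not the read.abi implementation:
the data frame d and its values are hypothetical, and the column is
assumed to have already been renamed to dye.sample.peak with no
whitespace around the comma.

```r
# Hypothetical contents of the "Dye/Sample Peak" column after renaming;
# the entry ending in "*" stands for an internal size standard.
d <- data.frame(
  dye.sample.peak = c("B,1", "B,2", "R,1*", "G,1"),
  size = c(101.5, 150.2, 100.0, 98.7),
  stringsAsFactors = FALSE
)

# Exclude entries that end in "*".
d <- d[!grepl("\\*$", d$dye.sample.peak), ]

# Split "<Dye>,<Sample Peak>" into a dye code and an integer peak rank.
parts <- strsplit(d$dye.sample.peak, ",", fixed = TRUE)
d$dye <- vapply(parts, `[`, character(1), 1)
d$sample.peak <- as.integer(vapply(parts, `[`, character(1), 2))

# The combined column is no longer needed once it has been split.
d$dye.sample.peak <- NULL
```

Filtering before splitting means the "*" never reaches as.integer, so no
NA values are produced by coercion.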
The final column names are:
sample.file.name: Name of the file containing data.
size: Size of the peak (in base pairs).
height: Height of the peak (arbitrary units).
dye: Code for dye used.
sample.peak: Rank of peak within current sample.
Other columns from the ABI output may also be retained, but they are not used.
See also load.abi, which attempts to construct a TRAMPsamples object
from an ABI file (with a bit of user intervention).