read.fasta(file, header = FALSE, sep = "", quote = "\"", dec = ".",
           fill = FALSE, alphabet = aabet)
file: the name of the file to read. If it does not contain an absolute
path, the file name is relative to the current working directory, getwd().
Tilde-expansion is performed where supported. As from R 2.10.0 this can be
a compressed file (see file). Alternatively, file can be a readable
text-mode connection (which will be opened for reading if necessary, and if
so closed (and hence destroyed) at the end of the function call). (If
stdin() is used, the prompts for lines may be somewhat confusing. Terminate
input with a blank line or an EOF signal, Ctrl-D on Unix and Ctrl-Z on
Windows. Any pushback on stdin() will be cleared before return.) file can
also be a complete URL.
header: header is set to TRUE if and only if the first row contains one
fewer field than the number of columns.
sep: if sep = "" (the default for read.table) the separator is white
space, that is one or more spaces, tabs, newlines or carriage returns.
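
The white-space rule can be illustrated with base R's read.table, which this page points to for the reading step. This is a minimal sketch; the input text is made up for illustration, and it does not call read.fasta itself.

```r
# Hypothetical input: fields separated by two spaces, one tab and two tabs.
txt <- "a  b\tc\n1 2\t\t3"

# With sep = "", any run of spaces or tabs counts as a single separator.
df <- read.table(text = txt, header = TRUE, sep = "")
print(df)
```

Each run of mixed spaces and tabs is treated as one separator, so the input yields three columns and one data row.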
quote: to disable quoting altogether, use quote = "". See scan for the
behavior on quotes embedded in quotes. Quoting is only considered for
columns read as character, which is all of them unless colClasses is
specified.
fill: logical. If TRUE then in case the rows have unequal length, blank
fields are implicitly added.
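
The effect of fill can be seen with base R's read.table. This is a sketch on made-up numeric data, not a call to read.fasta, and it assumes read.fasta passes the flag through unchanged.

```r
# Hypothetical input: the second row has two fields instead of three.
txt <- "1 2 3\n4 5\n6 7 8"

# fill = TRUE pads the short row with a blank field, which is read as NA
# in a numeric column. Without it, read.table stops with an error.
df <- read.table(text = txt, fill = TRUE)
print(df)
```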
alphabet: the set of residue symbols, including X and -, where X is an
unspecified residue and - is a gap.
An object of class Sequences. This is a small extension of the matrix
class: each row of the matrix corresponds to a single sequence. The
sequences are always represented as integers. The rownames of the matrix
are the original string/character representations of the sequences.
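
The layout described above can be sketched in base R. The alphabet vector here is hypothetical (the contents of aabet are not shown on this page), and the result is a plain integer matrix rather than a real Sequences object.

```r
# Hypothetical alphabet, for illustration only; not the package's aabet.
alphabet <- c("F", "P", "R", "T", "W", "Y", "X", "-")
seqs <- c("FTRP", "FPYT", "FPRW")

# Split each string into residues and look up each residue's position in
# the alphabet. sapply() returns one column per sequence, so transpose.
codes <- t(sapply(strsplit(seqs, ""), match, table = alphabet))

# Keep the original strings as row names, as the text describes.
rownames(codes) <- seqs
print(codes)
```

Each row holds the integer codes of one sequence, and the row names preserve the original character form.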
See read.table for more information about reading the file itself.
Information about the FASTA format may be found elsewhere, but basically
each sequence starts with a definition/name delimited by a '>' character.
For example:
----------------------
>Sequence 1, from mouse
FTRP
>Sequence 2b, from humans
FPYT
>Unknown origin
FPRW
-----------------------
Each sequence should be the same length, thus - should be used to pad the
sequences, as seen in the example. Use an alignment algorithm, such as
Clustal, to align your sequences before reading. The ClustalW2 algorithm is
available from the European Bioinformatics Institute's website.
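
The record structure and the equal-length requirement can be checked in base R before reading a file. This is a rough sketch on hypothetical input whose second sequence is one residue short. Right-padding with - only equalises lengths and is not a substitute for a real alignment.

```r
# Hypothetical input: the second sequence is one residue shorter.
fasta <- c(">Sequence 1, from mouse", "FTRP",
           ">Sequence 2b, from humans", "FPY",
           ">Unknown origin", "FPRW")

is_header <- startsWith(fasta, ">")
# cumsum() numbers the records, so every line shares an id with its header.
record <- cumsum(is_header)
# Join the residue lines of each record into one string.
seqs <- as.vector(tapply(fasta[!is_header], record[!is_header],
                         paste, collapse = ""))
names(seqs) <- sub("^>", "", fasta[is_header])

# Right-pad with the gap character "-" so all sequences share one length.
width <- max(nchar(seqs))
padded <- paste0(seqs, strrep("-", width - nchar(seqs)))
names(padded) <- names(seqs)
print(padded)
```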
Sequences, read.sequences