seqinr (version 3.1-2)

read.alignment: Read aligned sequence files in mase, clustal, phylip, fasta or msf format

Description

Read a file in mase, clustal, phylip, fasta or msf format. These formats are used to store nucleotide or protein multiple alignments.

Usage

read.alignment(file, format, forceToLower = TRUE)

Arguments

file
the name of the file from which the aligned sequences are to be read. If it does not contain an absolute or relative path, the file name is relative to the current working directory, getwd().
format
a character string specifying the format of the file: "mase", "clustal", "phylip", "fasta" or "msf"
forceToLower
a logical defaulting to TRUE stating whether the returned characters in the sequence should be in lower case (introduced in seqinR release 1.1-3).
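
As a minimal sketch of what forceToLower changes, the snippet below reads the test.mase file shipped with seqinr twice and compares the results. The variable names (aln.file, lower, asis) are illustrative only, and unlist() is used as a precaution because the container type of the seq component is assumed here rather than documented.

```r
library(seqinr)

# Example alignment file distributed with the seqinr package
aln.file <- system.file("sequences/test.mase", package = "seqinr")

# Default behaviour: sequence characters are forced to lower case
lower <- read.alignment(file = aln.file, format = "mase")

# Keep the characters exactly as they appear in the file
asis <- read.alignment(file = aln.file, format = "mase", forceToLower = FALSE)

# Flatten both results to plain character vectors before comparing
lower.seq <- unlist(lower$seq, use.names = FALSE)
asis.seq  <- unlist(asis$seq, use.names = FALSE)

# Lower-casing the as-is sequences should reproduce the default result
all(tolower(asis.seq) == lower.seq)
```

Setting forceToLower = FALSE is mainly useful when the case of the residues carries information in the source file.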

Value

An object of class alignment which is a list with the following components:

  • nb: the number of aligned sequences
  • nam: a vector of strings containing the names of the aligned sequences
  • seq: a vector of strings containing the aligned sequences
  • com: a vector of strings containing the commentaries for each sequence or NA if there are no comments
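
The components listed above can be inspected directly once a file has been read. The following is a small sketch using the test.mase file shipped with seqinr; the variable name aln is illustrative, and no particular values are assumed for the contents of the file.

```r
library(seqinr)

# Read one of the example alignments distributed with seqinr
aln <- read.alignment(file = system.file("sequences/test.mase", package = "seqinr"),
  format = "mase")

class(aln)               # class of the returned object
aln$nb                   # number of aligned sequences
aln$nam                  # names of the aligned sequences
nchar(unlist(aln$seq))   # length of each aligned sequence, gaps included
aln$com                  # comments for each sequence, or NA if there are none
```

In a well-formed alignment the lengths reported by nchar() are expected to be equal, since every sequence is padded with gaps to the same number of columns.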

References

citation("seqinr")

See Also

To read aligned sequences in NEXUS format, see the function read.nexus that was available in the CompPairWise package (not sure it is still maintained as of 09/09/09). The NEXUS format was mainly used by the non-GPL commercial PAUP software.

Related functions: as.matrix.alignment, read.fasta, write.fasta, reverse.align, dist.alignment.
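
As a rough illustration of how two of the related functions fit with read.alignment, the sketch below converts an alignment to a character matrix and computes pairwise distances. It assumes that dist.alignment accepts matrix = "identity" in the installed seqinr version; the variable names are illustrative.

```r
library(seqinr)

# Read one of the example alignments distributed with seqinr
aln <- read.alignment(file = system.file("sequences/test.mase", package = "seqinr"),
  format = "mase")

# Character matrix with one row per sequence and one column per alignment position
aln.mat <- as.matrix.alignment(aln)
dim(aln.mat)

# Pairwise distances between the aligned sequences, based on identity
aln.dist <- dist.alignment(aln, matrix = "identity")
class(aln.dist)
```

The matrix form is convenient for column-wise operations on the alignment, while the distance object can be passed to standard clustering or tree-building functions that accept objects of class dist.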

Examples

mase.res <- read.alignment(file = system.file("sequences/test.mase", package = "seqinr"),
  format = "mase")
clustal.res <- read.alignment(file = system.file("sequences/test.aln", package = "seqinr"),
  format = "clustal")
phylip.res <- read.alignment(file = system.file("sequences/test.phylip", package = "seqinr"),
  format = "phylip")
msf.res <- read.alignment(file = system.file("sequences/test.msf", package = "seqinr"),
  format = "msf")
fasta.res <- read.alignment(file = system.file("sequences/Anouk.fasta", package = "seqinr"),
  format = "fasta")

#
# Quality control routine sanity checks:
#

data(mase); stopifnot(identical(mase, mase.res))
data(clustal); stopifnot(identical(clustal, clustal.res))
data(phylip); stopifnot(identical(phylip, phylip.res))
data(msf); stopifnot(identical(msf, msf.res))
data(fasta); stopifnot(identical(fasta, fasta.res))