
bio3d (version 2.2-4)

get.seq: Download FASTA Sequence Files

Description

Downloads FASTA sequence files from the NR or SWISSPROT/UNIPROT databases.

Usage

get.seq(ids, outfile = "seqs.fasta", db = "nr")

Arguments

ids
A character vector of one or more database codes/identifiers of the sequences to be downloaded.
outfile
A single element character vector specifying the name of the local file to which sequences will be written.
db
A single element character vector specifying the database from which sequences are to be obtained.

Value

If all files are successfully downloaded, a list object with two components is returned:
ali
an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.
id
sequence names as identifiers.
This list is similar to the object returned by read.fasta. However, if some files were not successfully downloaded, a vector listing the ids that were not found is returned instead.
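
Because the return value can take two different shapes, calling code may want to check which one it received before using it. The following is a minimal sketch of such a check, not part of bio3d: the helper name describe_get_seq and the mock objects are hypothetical, and only the ali component is mocked so that the example runs without a network connection.

```r
# Hypothetical helper (not part of bio3d): classify what get.seq() returned.
describe_get_seq <- function(res) {
  if (is.list(res) && !is.null(res$ali)) {
    # Success: a list holding an alignment matrix, one row per sequence
    list(ok = TRUE, n_seq = nrow(res$ali), missing = character(0))
  } else {
    # Failure: a vector of the ids that were not found
    list(ok = FALSE, n_seq = 0L, missing = as.character(res))
  }
}

# Mock objects standing in for real downloads (assumed shapes, see Value)
mock_ok <- list(
  ali = matrix(c("M", "T", "E",
                 "M", "T", "E"), nrow = 2, byrow = TRUE)
)
mock_fail <- c("XXXXXX")  # made-up identifier that was "not found"

ok_info   <- describe_get_seq(mock_ok)
fail_info <- describe_get_seq(mock_fail)
```

In real use, the object passed to the helper would be the result of a get.seq call, and the missing element could be used to decide which identifiers to retry or report.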

Details

This is a basic function to automate sequence file download from the NR and SWISSPROT/UNIPROT databases.

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.

See Also

blast.pdb, read.fasta, read.fasta.pdb, get.pdb

Examples

## Not run: 
# ## Sequence identifiers (GI or PDB codes e.g. from blast.pdb etc.)
# get.seq( c("P01112", "Q61411", "P20171") )
# 
# #aa <- get.seq( c("4q21", "5p21") )
# #aa$id
# #aa$ali
# ## End(Not run)
