bio3d (version 2.4-4)

get.seq: Download FASTA Sequence Files

Description

Downloads FASTA sequence files from the NCBI nr, SWISSPROT/UNIPROT, or RCSB PDB databases.

Usage

get.seq(ids, outfile = "seqs.fasta", db = "nr", verbose = FALSE)

Value

If all files are successfully downloaded, a list object with two components is returned:

ali

an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.

ids

sequence names as identifiers.

This is similar to the object returned by read.fasta. However, if some files could not be downloaded, a vector listing the ids that were not found is returned instead.
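
Because the return value can take either of the two forms described above, calling code may want to check which one it received before using the alignment. The following is a minimal sketch in base R, not part of bio3d: the helper name download_succeeded is hypothetical, and the two objects are mock data standing in for real get.seq() results so that the example runs without network access.

```r
# Hypothetical helper (not part of bio3d): returns TRUE when `x` looks like
# the list returned on success, and FALSE when it is a plain vector of the
# ids that were not found.
download_succeeded <- function(x) {
  # && short-circuits, so x$ali is only evaluated when x is a list
  is.list(x) && is.matrix(x$ali)
}

# Mock objects (assumptions for illustration only, not real downloads).
ok <- list(
  ali = matrix(c("M", "T", "E",
                 "M", "T", "D"), nrow = 2, byrow = TRUE),
  ids = c("seqA", "seqB")
)
failed <- c("badid1", "badid2")

download_succeeded(ok)
download_succeeded(failed)
```

Testing for a list with a matrix-valued ali component avoids relying on the exact name of the identifier component.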

Arguments

ids

A character vector of one or more appropriate database codes/identifiers of the files to be downloaded.

outfile

A single element character vector specifying the name of the local file to which sequences will be written.

db

A single element character vector specifying the database from which sequences are to be obtained.

verbose

Logical. If TRUE, URL details of the download process are printed.

Author

Barry Grant

Details

This is a basic function that automates sequence file download from databases including NCBI nr, SWISSPROT/UNIPROT, and RCSB PDB.

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.

See Also

blast.pdb, read.fasta, read.fasta.pdb, get.pdb

Examples

if (FALSE) {
## Sequence identifiers (GI or PDB codes e.g. from blast.pdb etc.)
get.seq( c("P01112", "Q61411", "P20171") )

#aa <-get.seq( c("4q21", "5p21") )
#aa$id
#aa$ali
}