Downloads FASTA sequence files from the NR or SWISSPROT/UNIPROT
databases.
Usage
get.seq(ids, outfile = "seqs.fasta", db = "nr", verbose = FALSE)
Arguments
ids
A character vector of one or more appropriate database
codes/identifiers of the files to be downloaded.
outfile
A single element character vector specifying the name
of the local file to which sequences will be written.
db
A single element character vector specifying the database
from which sequences are to be obtained.
verbose
logical, if TRUE, URL details of the download process
are printed; otherwise a progress bar is displayed.
Value
If all files are successfully downloaded, a list object with two
components is returned:
ali
an alignment character matrix with a row per sequence and
a column per equivalent amino acid/nucleotide.
ids
sequence names as identifiers.
This list is similar to that returned by read.fasta. However,
if some files were not successfully downloaded, a vector detailing
which ids were not found is returned instead.
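Because the return value can take two different forms, calling code may want to check which form it received before using it. The following is a minimal sketch of such a check, not part of bio3d: the helper name summarize_seq_result is hypothetical, and the mock objects merely imitate the two outcomes described above (a list with ali and ids components, or a vector of identifiers that were not found), so no download is performed.

```r
# Hypothetical helper (NOT a bio3d function): classify a get.seq()-style
# result using only the two outcomes described in the Value section.
summarize_seq_result <- function(res) {
  if (is.list(res) && all(c("ali", "ids") %in% names(res))) {
    # Success case: 'ali' is a character matrix, one row per sequence
    list(ok = TRUE,
         n_seq = nrow(res$ali),
         n_col = ncol(res$ali),
         missing = character(0))
  } else {
    # Failure case: assumed to be a vector of identifiers that were not found
    list(ok = FALSE,
         n_seq = 0L,
         n_col = 0L,
         missing = as.character(res))
  }
}

# Mock objects standing in for real downloads (assumed shapes, no network)
mock_ok <- list(
  ali = matrix(c("M", "T", "E",
                 "M", "T", "-"),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("seqA", "seqB"), NULL)),
  ids = c("seqA", "seqB")
)
mock_failed <- c("badid1", "badid2")

ok_summary <- summarize_seq_result(mock_ok)
failed_summary <- summarize_seq_result(mock_failed)
```

In real use the argument would be the object returned by get.seq; checking the ok element first avoids indexing into a result that is only a vector of missing identifiers.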
Details
This is a basic function to automate sequence file download from the
NR and SWISSPROT/UNIPROT databases.
References
Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.
Examples
# NOT RUN {
## Sequence identifiers (GI or PDB codes e.g. from blast.pdb etc.)
get.seq( c("P01112", "Q61411", "P20171") )

#aa <- get.seq( c("4q21", "5p21") )
#aa$id
#aa$ali
# }