The Entrez programming utilities (E-utilities) are a toolset for automated download of data from the
NCBI databases; see the E-utilities Quick Start for details. entrezDownload can be used to download
genomes from the NCBI Nucleotide database through these utilities.
The argument accession must be a set of valid accession numbers at NCBI Nucleotide, typically
all accession numbers related to a genome (chromosomes, plasmids, contigs, etc.). For completed genomes,
where the number of sequences is low, accession is typically a single text listing all accession
numbers separated by commas. For draft genomes with a large number of contigs, the accession numbers
must be split into several comma-separated texts, because Entrez will not accept too many queries in
one chunk: each text should list fewer than 500 accession numbers.
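
The splitting described above can be sketched in R as follows. This is a minimal sketch, not part
of the package: the helper name chunkAccessions and the chunk size of 400 are assumptions made for
illustration, the latter chosen only to stay safely under the limit of 500.

```r
# Hypothetical helper (not part of the package): split a character vector of
# accession numbers into comma-separated texts of at most chunk.size each.
chunkAccessions <- function(acc, chunk.size = 400) {
  stopifnot(is.character(acc), chunk.size >= 1)
  if (length(acc) == 0) return(character(0))
  # Group index 1,1,...,2,2,... so that each group holds at most chunk.size items
  grp <- ceiling(seq_along(acc) / chunk.size)
  # Paste each group into one comma-separated text
  unname(vapply(split(acc, grp), paste, character(1), collapse = ","))
}

# Made-up accession numbers, for illustration only
acc <- sprintf("XX%06d.1", 1:950)
accession <- chunkAccessions(acc)
length(accession)   # 3 texts holding 400, 400 and 150 accession numbers
```

The result is a character vector with one comma-separated text per chunk, which matches the form
described above for draft genomes with many contigs.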
The downloaded sequences are saved in a file on your system. This is a FASTA formatted file,
and should by convention have the filename extension .fsa. Note that all downloaded sequences end
up in this one file. To download multiple genomes, call entrezDownload once for each genome.
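
Downloading several genomes could be scripted along the following lines. This is a sketch under
stated assumptions: the genome names and accession numbers are placeholders, downloadGenomes is a
hypothetical wrapper, and the download function is assumed to take the accession text as its first
argument and the output file path as its second. Check the usage section before passing
entrezDownload itself.

```r
# Hypothetical wrapper: one call, and one FASTA file, per genome.
# download.fun is assumed to take (accession, file path) in that order.
downloadGenomes <- function(genomes, out.dir, download.fun) {
  out.files <- file.path(out.dir, paste0(names(genomes), ".fsa"))
  for (i in seq_along(genomes)) {
    download.fun(genomes[[i]], out.files[i])
  }
  invisible(out.files)
}

# Placeholder accession numbers, not real records
genomes <- list(
  genomeA = "XX000001.1,XX000002.1",
  genomeB = "XX000003.1"
)

# Dry run with a stand-in that only reports what would be downloaded;
# replace it with entrezDownload once the argument order is confirmed.
dryRun <- function(accession, out.file) {
  message("would save ", accession, " to ", out.file)
}
files <- downloadGenomes(genomes, tempdir(), dryRun)
```

Giving each genome its own output file keeps the sequences of different genomes apart, since all
sequences from one call end up in the same file.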