- name
(character of length 1) The taxon to download a sample of sequences for.
- id
(character of length 1) The taxon ID to download a sample of sequences for.
- target_rank
(character of length 1) The finest taxonomic rank at which to sample. The
finest rank at which replication occurs. Must be a finer rank than taxon.
- min_counts
(named numeric) The minimum number of sequences to download for each
taxonomic rank. The names correspond to taxonomic ranks.
- max_counts
(named numeric) The maximum number of sequences to download for each
taxonomic rank. The names correspond to taxonomic ranks.
- interpolate_min
(logical) If TRUE, values supplied to min_counts and min_children will be
used to infer the values of intermediate ranks not specified. Linear
interpolation between values of specified ranks will be used to determine
values of unspecified ranks.
- interpolate_max
(logical) If TRUE, values supplied to max_counts and max_children will be
used to infer the values of intermediate ranks not specified. Linear
interpolation between values of specified ranks will be used to determine
values of unspecified ranks.
- min_children
(named numeric) The minimum number of sub-taxa that a taxon of a given rank
must have for its sequences to be searched. The names correspond to
taxonomic ranks.
- max_children
(named numeric) The maximum number of sub-taxa that a taxon of a given rank
may have for its sequences to be searched. The names correspond to
taxonomic ranks.
- seqrange
(character) Sequence range, e.g. "1:1000". This is the range of sequence
lengths to search for, so "1:1000" means search for sequences from 1 to 1000
characters in length.
- getrelated
(logical) If TRUE, gets the longest sequences of a species
in the same genus as the one searched for. If FALSE, returns nothing if no
match found.
- fuzzy
(logical) Whether to do fuzzy taxonomic ID search or exact search. If TRUE,
we use xXarbitraryXx[porgn:__txid<ID>], but if FALSE, we use txid<ID>.
Default: FALSE.
- limit
(numeric) Number of sequences to search for and return. Max of 10,000. If
you search for 6000 records and only 5000 are found, you will only get 5000
back.
- entrez_query
(character; length 1) An Entrez-format query to filter results with. This is
useful to search for sequences with specific characteristics. The format is
the same as the one used to search GenBank
(https://www.ncbi.nlm.nih.gov/books/NBK3837/#EntrezHelp.Entrez_Searching_Options).
- hypothetical
(logical; length 1) If FALSE, an attempt will be made to not return
hypothetical or predicted sequences, judging from accession number prefixes
(XM and XR). This can result in fewer sequences than limit being returned
even if there are more sequences available, since this filtering is done
after searching NCBI.
- verbose
(logical) If TRUE, progress messages will be printed.
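
The interpolation and post-search filtering described in the arguments above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names interpolate_ranks and drop_predicted are hypothetical, the rank ordering and accession strings are made up for the example, and interpolating over rank position without rounding is an assumption.

```r
# Hypothetical helper (not part of the documented function): fill in
# unspecified ranks by linear interpolation over rank position.
# `ranks` is an assumed ordering of ranks from coarsest to finest.
interpolate_ranks <- function(counts, ranks) {
  stopifnot(!is.null(names(counts)), all(names(counts) %in% ranks))
  if (length(counts) < 2) {
    return(counts)  # nothing to interpolate between
  }
  known <- match(names(counts), ranks)  # positions of the specified ranks
  span <- seq(min(known), max(known))   # specified ranks plus those in between
  filled <- stats::approx(x = known, y = counts, xout = span)$y
  stats::setNames(filled, ranks[span])
}

# Hypothetical helper: drop accessions that start with the
# predicted-sequence prefixes XM and XR, in the spirit of hypothetical = FALSE.
drop_predicted <- function(accessions) {
  accessions[!grepl("^(XM|XR)", accessions)]
}

# Assumed rank ordering, for illustration only.
ranks <- c("order", "family", "genus", "species")

# Only the two end ranks are specified; the middle ones are inferred.
interpolate_ranks(c(order = 100, species = 10), ranks)
# order = 100, family = 70, genus = 40, species = 10

# Illustrative accession strings, not real search results.
drop_predicted(c("XM_000001.1", "AB000001.1", "XR_000002.1"))
# keeps only "AB000001.1"
```

Because a filter like drop_predicted runs on results that have already been retrieved, it can only shrink the result set, which is why fewer sequences than limit may come back.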