taxonomizr (version 0.10.6)

read.accession2taxid: Read NCBI accession2taxid files

Description

Take NCBI accession2taxid files, keep only the accession and taxa ID columns, and save them as a SQLite database

Usage

read.accession2taxid(
  taxaFiles,
  sqlFile,
  vocal = TRUE,
  extraSqlCommand = "",
  indexTaxa = FALSE,
  overwrite = FALSE
)

Value

TRUE if successful

Arguments

taxaFiles

a string or vector of strings giving the path(s) to files to be read in

sqlFile

a string giving the path where the output SQLite file should be saved

vocal

if TRUE, output status messages

extraSqlCommand

for advanced use. A string giving a command to be called on the SQLite database before loading data. A potential use:

  • "pragma temp_store = 2;" to keep all SQLite temp files in memory. Don't do this unless you have a lot (>100 GB) of RAM
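
As a rough sketch of how this argument might be passed, the following loads a two-row toy file with the pragma from the bullet above. The toy data and the variable names (inFile, sqlFile, loaded) are illustrative only; with real NCBI files the memory caveat applies.

```r
library(taxonomizr)

# Toy input in the accession2taxid layout (illustrative data only)
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
)
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# Ask SQLite to keep its temp files in memory before the data are loaded
read.accession2taxid(
  inFile, sqlFile,
  vocal = FALSE,
  extraSqlCommand = "pragma temp_store = 2;"
)

# Check what ended up in the database
db <- RSQLite::dbConnect(RSQLite::SQLite(), dbname = sqlFile)
loaded <- RSQLite::dbGetQuery(db, "SELECT * FROM accessionTaxa")
RSQLite::dbDisconnect(db)
```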

indexTaxa

if TRUE, add an index on taxa ID. This is only necessary if you want to look up accessions by taxa ID, e.g. with getAccessions
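
A minimal sketch of the reverse lookup this index is meant to support. It assumes that getAccessions, from the same package, takes the taxa ID first and the database path second and returns a data frame; check its help page before relying on that. The toy data are illustrative only.

```r
library(taxonomizr)

# Toy input: two accessions for taxa ID 3702, one for 9606 (illustrative data)
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570",
  "X00001\tX00001.1\t9606\t12345"
)
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# Build the database with an extra index on the taxa ID column
read.accession2taxid(inFile, sqlFile, vocal = FALSE, indexTaxa = TRUE)

# Look up accessions by taxa ID (argument order is an assumption, see above)
hits <- getAccessions(3702, sqlFile)
```

The lookup may also work without the index, but on full-size NCBI files it would likely have to scan the whole table.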

overwrite

if TRUE, delete the accessionTaxa table in the database, if present, and regenerate it
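
A small sketch of regenerating the table in an existing database, assuming the second call replaces the table rather than appending to it, as the description above suggests. File contents and variable names are illustrative only.

```r
library(taxonomizr)

header <- "accession\taccession.version\ttaxid\tgi"
firstFile <- tempfile()
secondFile <- tempfile()
sqlFile <- tempfile()

# First load: two toy rows
writeLines(c(header,
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
), firstFile)
read.accession2taxid(firstFile, sqlFile, vocal = FALSE)

# Second load into the same database: drop accessionTaxa and rebuild it
writeLines(c(header,
  "Z17429\tZ17429.1\t3702\t16571"
), secondFile)
read.accession2taxid(secondFile, sqlFile, vocal = FALSE, overwrite = TRUE)

# Inspect the regenerated table
db <- RSQLite::dbConnect(RSQLite::SQLite(), dbname = sqlFile)
reloaded <- RSQLite::dbGetQuery(db, "SELECT * FROM accessionTaxa")
RSQLite::dbDisconnect(db)
```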

References

https://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/

See Also

read.nodes.sql, read.names.sql

Examples

# Toy input in the tab-separated accession2taxid layout
taxa<-c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570",
  "Z17429\tZ17429.1\t3702\t16571",
  "Z17430\tZ17430.1\t3702\t16572"
)
inFile<-tempfile()
sqlFile<-tempfile()
writeLines(taxa,inFile)
# Load the file into a SQLite database at sqlFile
read.accession2taxid(inFile,sqlFile,vocal=FALSE)
# Inspect the resulting accessionTaxa table
db<-RSQLite::dbConnect(RSQLite::SQLite(),dbname=sqlFile)
RSQLite::dbGetQuery(db,'SELECT * FROM accessionTaxa')
RSQLite::dbDisconnect(db)
