fulltext (version 2.0)

as.ft_data: Coerce directory of papers to ft_data object

Description

Creates the same object that ft_get() returns, built from files already in your cache, so you do not have to run ft_get() again.

Usage

as.ft_data(path = NULL)

Arguments

path

Cache path. If not given, the default cache path is used. Default: NULL

Value

An object of class ft_data

Details

We use an internal store of identifiers to keep track of files. These identifiers appear in the output of ft_get(). If a file does not have a matching entry in our index of files (e.g., if you drop a file into the cache location as in the example below), we assign it an index based on the file path. Ideally we would use an article DOI or similar, but we cannot safely retrieve one from just a file path.
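The fallback described above can be pictured with a small base R sketch. It is not the package's internal code: the `index` data frame, its column names, the placeholder identifiers, and the helper `resolve_ids()` are all hypothetical, and using the path itself as the fallback identifier is an assumption made for illustration.

```r
# Hypothetical index mapping cached file paths to article identifiers.
# The paths and identifiers are made-up placeholders.
index <- data.frame(
  path = c("/cache/a.xml", "/cache/b.xml"),
  id   = c("10.0000/example.1", "10.0000/example.2"),
  stringsAsFactors = FALSE
)

# Look up each path in the index; where no entry matches,
# fall back to an identifier derived from the path (here, the path itself).
resolve_ids <- function(paths, index) {
  ids <- index$id[match(paths, index$path)]
  unmatched <- is.na(ids)
  ids[unmatched] <- paths[unmatched]
  ids
}

# One indexed file and one file dropped into the cache by hand
files <- c("/cache/a.xml", "/cache/dropped-in.xml")
resolve_ids(files, index)
```

In this sketch the first file resolves to its indexed identifier, while the hand-added file keeps a path-based identifier, which mirrors the behavior described for files missing from the index.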

See Also

ft_get()

Examples

# NOT RUN {
# put a file in the cache in case there aren't any
dir <- file.path(tempdir(), "testing")
dir.create(dir)
file <- system.file("examples", "elife.xml", package = "fulltext")
writeLines(readLines(file), tempfile(tmpdir = dir, fileext = ".xml"))

# call as.ft_data
x <- as.ft_data(path = dir)

# output lives underneath a special list index "cached" 
#   representing already present files
x$cached

# }
# NOT RUN {
# collect chunks
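if (requireNamespace("pubchunks", quietly = TRUE)) {
  library(pubchunks)
  # read the cached files into memory
  res <- ft_collect(x)
  # extract DOI and title, then tabulate; nested calls avoid relying on %>%
  pub_tabularize(pub_chunks(res, c("doi", "title")))
}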
# }
