Create volatile corpora.
VCorpus(x, readerControl = list(reader = reader(x), language = "en"))
as.VCorpus(x)
An object inheriting from VCorpus and Corpus.
x
For VCorpus, a Source object; for as.VCorpus, an R object.
readerControl
a named list of control parameters for reading in content from x.
reader
a function capable of reading in and processing the format delivered by x.
language
a character giving the language (preferably as IETF language tags, see language in package NLP). The default language is assumed to be English ("en").
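As a minimal sketch of how the control parameters fit together, the snippet below passes both reader and language explicitly. The sample sentences are made up for illustration, and VectorSource with readPlain is just one convenient source and reader pairing, not the only one.

```r
library(tm)

# Made-up German sample sentences, used only for illustration
docs <- c("Der Preis steigt.", "Die OPEC tagt in Wien.")

# Read each vector element as plain text and tag the documents as German
corp <- VCorpus(VectorSource(docs),
                readerControl = list(reader = readPlain, language = "de"))

# The language given in readerControl ends up in each document's metadata
meta(corp[[1]], "language")
```

Leaving readerControl out falls back to the defaults shown in the usage line: the reader supplied by the source and the language "en".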
A volatile corpus is fully kept in memory and thus all changes only affect the corresponding R object.
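The following sketch illustrates what this means in practice, assuming a small made-up character vector as input: a transformation returns a modified copy, and the original corpus object stays as it was unless it is reassigned.

```r
library(tm)

# Two short in-memory documents (made-up sample text)
docs <- c("Crude oil prices rose.", "OPEC met in Vienna.")
corp <- VCorpus(VectorSource(docs))

# tm_map returns a new corpus; `corp` itself is not modified
lower <- tm_map(corp, content_transformer(tolower))

content(corp[[1]])   # still the original text
content(lower[[1]])  # lower-cased copy
```

Nothing is written to disk here, so both objects disappear when the R session ends unless they are saved explicitly.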
Corpus for basic information on the corpus infrastructure employed by package tm.
PCorpus provides an implementation with permanent storage semantics.
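library(tm)
# Directory of sample Reuters-21578 XML files shipped with tm
reut21578 <- system.file("texts", "crude", package = "tm")
# Build an in-memory corpus, reading each XML file as plain text
VCorpus(DirSource(reut21578, mode = "binary"),
        list(reader = readReut21578XMLasPlain))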