tm.plugin.dc (version 0.2-10)

Revisions: Revisions of a Distributed Corpus

Description

Each modification of the documents in the corpus results in a new stage, i.e., a revision of the corpus. To allow fast switching between revisions, all modifications may be kept on the file system. The function setRevision() makes it possible to go back to any stage in the history of the corpus. The function keepRevisions() reports whether revisions are turned on or off; the corresponding replacement function sets the desired behavior.

Usage

getRevisions( corpus )
removeRevision( corpus, revision )
setRevision( corpus, revision )
keepRevisions( corpus )
`keepRevisions<-`( corpus, value )

Arguments

corpus

A distributed corpus of class DCorpus.

revision

The revision which is to be set as active or removed.

value

A logical indicating whether revisions should be kept or not.

Value

getRevisions() returns a list of character strings naming all available revisions. setRevision() returns the distributed corpus with the given revision marked as active. keepRevisions() returns a logical indicating whether revisions are kept.
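The following is a minimal sketch of how the replacement function might be used to control which modifications produce a stored revision. It relies only on the functions listed under Usage plus tm's tm_map() and content_transformer(); the choice of transformations and the variable names dc and revs are illustrative assumptions, not part of the package documentation. It also assumes that both packages are installed and that as.DCorpus() can use its default storage.

```r
library("tm")            # provides the "crude" data set and tm_map()
library("tm.plugin.dc")  # provides DCorpus and the revision functions

data("crude")
dc <- as.DCorpus(crude)

## check whether revisions are currently kept (returns a logical)
keepRevisions(dc)

## switch revisions off before a modification whose earlier stage
## does not need to be revisited
keepRevisions(dc) <- FALSE
dc <- tm_map(dc, content_transformer(tolower))

## switch revisions back on for later modifications
keepRevisions(dc) <- TRUE
dc <- tm_map(dc, content_transformer(toupper))

## list the revisions that are available at this point
revs <- getRevisions(dc)
revs
```

Which revisions appear in revs depends on the package's internal bookkeeping, so the sketch prints the list rather than assuming its contents.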

Examples

# NOT RUN {
## load the packages (tm provides the "crude" data set)
library("tm")
library("tm.plugin.dc")
## provide data on storage
data("crude")
dc <- as.DCorpus(crude)
## do some preprocessing
dc <- tm_map(dc, content_transformer(tolower))
## retrieve available revisions
revs <- getRevisions(dc)
revs
## go back to original revision; setRevision() returns the corpus
## with that revision marked as active, so keep the result
dc <- setRevision(dc, revs[2])
## check whether revisions are kept, then turn them off
keepRevisions(dc)
keepRevisions(dc) <- FALSE
# }