This method removes tokens of the specified word classes or POS tags from tagged text objects.
filterByClass(txt, ...)

# S4 method for kRp.text
filterByClass(
  txt,
  corp.rm.class = "nonpunct",
  corp.rm.tag = c(),
  as.vector = FALSE,
  update.desc = TRUE
)
txt: An object of class kRp.text.
...: Additional options, currently unused.
corp.rm.class: A character vector with word classes which should be removed. The default value "nonpunct" has special meaning and will cause the result of kRp.POS.tags(lang, tags=c("punct","sentc"), list.classes=TRUE) to be used. Another valid value is "stopword" to remove all detected stopwords.
corp.rm.tag: A character vector with valid POS tags which should be removed.
as.vector: Logical. If TRUE, results will be returned as a character vector containing only the text parts which survived the filtering.
update.desc: Logical. If TRUE, the desc slot of the tagged object will be fully recalculated using the filtered text. If FALSE, the desc slot will be copied from the original object. Finally, if NULL, the desc slot remains empty.
An object of the input class. If as.vector=TRUE, returns only a character vector.
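The filtering described by the arguments above can be illustrated in base R without koRpus installed. This is only a sketch of the general idea, not the package's implementation: the data frame stands in for the token table of a tagged object, its column names (token, tag, wclass) and the word class labels are assumptions, and filter_tokens() is a hypothetical helper that does not exist in koRpus.

```r
# Hypothetical stand-in for the token table of a tagged text object;
# the column names and class labels are assumptions made for this sketch.
tokens <- data.frame(
  token  = c("The", "cat", "sleeps", ",", "then", "eats", "."),
  tag    = c("DT", "NN", "VBZ", ",", "RB", "VBZ", "SENT"),
  wclass = c("determiner", "noun", "verb", "comma", "adverb", "verb", "fullstop"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of koRpus: drop every row whose word class
# or POS tag is listed for removal; optionally return only the tokens.
filter_tokens <- function(tokens, rm.class = c(), rm.tag = c(), as.vector = FALSE) {
  keep <- !(tokens$wclass %in% rm.class | tokens$tag %in% rm.tag)
  kept <- tokens[keep, , drop = FALSE]
  if (isTRUE(as.vector)) {
    return(kept$token)
  }
  kept
}

# Remove the punctuation classes and return the surviving tokens as a vector
words_only <- filter_tokens(tokens, rm.class = c("comma", "fullstop"), as.vector = TRUE)
print(words_only)

# Remove by POS tag instead and keep the table structure
no_determiners <- filter_tokens(tokens, rm.tag = "DT")
print(no_determiners)
```

The two calls mirror the two return modes described above: with as.vector = TRUE only the surviving text parts come back, otherwise the result keeps the structure of the input.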
# NOT RUN {
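# tagged.text is assumed to be an existing kRp.text object,
# e.g. the result of treetag() or tokenize()
filterByClass(tagged.text)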
# }