hyphen(words, hyph.pattern = NULL, min.length = 3,
rm.hyph = TRUE, corp.rm.class = "nonpunct",
corp.rm.tag = c(), quiet = FALSE, cache = TRUE)
words: Either a koRpus object (for example of class
  kRp.tagged-class or kRp.txt.freq-class) or the words to be
  processed.
hyph.pattern: Either an object of class kRp.hyph.pat-class, or a
  valid character string naming the language of the patterns to
  be used. See details.
corp.rm.class: The value "nonpunct" has special meaning and will
  cause the result of
  kRp.POS.tags(lang, c("punct","sentc"), list.classes=TRUE)
  to be used. Relevant only if words is a valid koRpus object.
quiet: If FALSE, short status messages will be shown.
cache: hyphen() can cache results to speed up the process. If
  this option is set to TRUE, the current cache will be queried
  and new tokens will also be added. Caches are language-specific
  and reside in an environment.

Returns an object of class kRp.hyphen-class.
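As a rough illustration of the caching idea described for the cache argument, the following base R sketch keeps one lookup table per language inside an environment. It is not the internal implementation of koRpus: cache_env, cached_lookup and toy_hyphenate are hypothetical names, and the toy hyphenator is only a placeholder, not a real pattern matcher.

```r
# Hypothetical sketch: language-specific caches stored in an environment.
cache_env <- new.env()

# Placeholder "hyphenator": splits a word in the middle.
# This is NOT pattern-based hyphenation, it only stands in for the real work.
toy_hyphenate <- function(word) {
  n <- nchar(word)
  if (n < 4) return(word)
  half <- n %/% 2
  paste0(substr(word, 1, half), "-", substr(word, half + 1, n))
}

cached_lookup <- function(word, lang) {
  # Create the per-language cache on first use
  if (!exists(lang, envir = cache_env, inherits = FALSE)) {
    assign(lang, new.env(), envir = cache_env)
  }
  lang_cache <- get(lang, envir = cache_env, inherits = FALSE)

  # Query the cache first
  if (exists(word, envir = lang_cache, inherits = FALSE)) {
    return(get(word, envir = lang_cache, inherits = FALSE))
  }

  # Cache miss: compute the result and add the new token
  result <- toy_hyphenate(word)
  assign(word, result, envir = lang_cache)
  result
}

first  <- cached_lookup("pattern", "en")  # computed, then stored
second <- cached_lookup("pattern", "en")  # answered from the cache
cached_tokens <- ls(get("en", envir = cache_env))
```

Because the tables live in an environment rather than in a file, a sketch like this only holds its contents in memory.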
hyph.XX
If words is already a tagged object, its language definition
might be used. Otherwise, in addition to the words to be
processed, you must specify hyph.pattern. You have two options:
to use one of the built-in language patterns, set it to the
corresponding language abbreviation; alternatively, pass an
object of class kRp.hyph.pat-class. As of this version, valid
abbreviations are:
"de"
"de.old"
"en"
"en.us"
"es"
"fr"
"it"
"ru"
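A call using one of the abbreviations listed above might look like the following sketch. It assumes that the koRpus version documented here is installed and that a plain character vector is accepted as words; other releases of the package may behave differently, so the call is wrapped defensively. sample_words and hyph_result are hypothetical names chosen for illustration.

```r
# Hypothetical sample input
sample_words <- c("hyphenation", "computer", "language")

if (requireNamespace("koRpus", quietly = TRUE)) {
  # Assumption: this koRpus version exports hyphen() with the
  # signature shown in the usage section.
  hyph_result <- tryCatch(
    koRpus::hyphen(
      sample_words,
      hyph.pattern = "en",  # built-in English patterns
      quiet = TRUE,         # suppress status messages
      cache = FALSE         # do not query or extend the cache
    ),
    error = function(e) {
      message("hyphen() call failed, possibly due to a different ",
              "koRpus version: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(hyph_result)) {
    print(class(hyph_result))
  }
} else {
  message("koRpus is not installed; skipping the example.")
}
```

Since sample_words is not a tagged object, no language definition can be taken from it, which is why hyph.pattern is given explicitly.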
See also: read.hyph.pat, manage.hyph.pat