When a filename is provided, the function will return a vector of terms. If nothing is provided, it will return the stop words used in package jiebaR. See Details.
make_stoplist(x = "jiebar", print = TRUE)
A length-1 character vector specifying a valid stop word file. If it is not provided, or is "jiebar" (default), "jiebaR" or "auto", the function will return part of the stop words used by package jiebaR. See Details.
TRUE or FALSE, whether to print the first 5 words.
A character vector of words. If no word is obtained, it will return NULL.
In a valid stop word file, each word should occupy its own line. However, a line that contains more than one word is also accepted, provided the words are separated by blanks, punctuation marks or numbers, because the function will try to split them. Duplicated words are removed automatically. The encoding of a stop word file is auto-detected by the function.
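The splitting and de-duplication rules described above can be sketched in base R. This is only an illustration under stated assumptions: split_stop_lines is a hypothetical helper, not the code used by make_stoplist, and it uses ASCII sample words so that the result does not depend on the locale.

```r
# Hypothetical helper illustrating the rules in Details.
# This is NOT the implementation used by make_stoplist.
split_stop_lines <- function(lines) {
  # Split every line on runs of blanks, punctuation marks, or digits.
  pieces <- unlist(strsplit(lines, "[[:space:][:punct:][:digit:]]+"))
  # Drop empty strings left over by leading separators.
  pieces <- pieces[nzchar(pieces)]
  # Mirror the documented return value: NULL when no word is obtained.
  if (length(pieces) == 0) {
    return(NULL)
  }
  # Remove duplicated words, keeping the first occurrence.
  unique(pieces)
}

# One word per line is the expected layout; the other lines hold
# several words separated by a blank, a comma, and a digit.
lines <- c("the", "a an", "of,in", "to2the")
result <- split_stop_lines(lines)
print(result)
# [1] "the" "a"   "an"  "of"  "in"  "to"
```

The second "the" is dropped because it duplicates the first entry, and an input consisting only of separators yields NULL.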
For the stop word list from jiebaR, see jiebaR::STOPPATH. It contains many words that are often removed when analyzing Chinese text. However, the result returned by make_stoplist is slightly different.
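One way to inspect that difference is to compare the two lists directly. The sketch below assumes that make_stoplist is exported by the chinese.misc package and that the file at jiebaR::STOPPATH is UTF-8 encoded; neither assumption is stated on this page. The comparison is skipped when either package is missing.

```r
# Assumption: make_stoplist comes from the chinese.misc package,
# and the jiebaR stop word file is UTF-8 encoded.
compared <- FALSE
if (requireNamespace("chinese.misc", quietly = TRUE) &&
    requireNamespace("jiebaR", quietly = TRUE)) {
  # Stop words as returned by make_stoplist (default source, no printing).
  stops <- chinese.misc::make_stoplist("jiebar", print = FALSE)
  # Raw lines of the stop word file shipped with jiebaR.
  raw <- readLines(jiebaR::STOPPATH, encoding = "UTF-8", warn = FALSE)
  # Entries present in the raw file but absent from the returned vector.
  only_in_raw <- setdiff(raw, stops)
  cat("returned:", length(stops), " raw file:", length(unique(raw)), "\n")
  print(head(only_in_raw))
  compared <- TRUE
}
```

No expected output is shown because the counts depend on the installed package versions.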