qdap (version 0.2.5)

stopwords: Remove Stopwords

Description

Removes stop words from transcript text.

Usage

stopwords(textString, stopwords = Top25Words,
    unlist = FALSE, separate = TRUE, strip = FALSE,
    unique = FALSE, char.keep = NULL, names = FALSE,
    ignore.case = TRUE, apostrophe.remove = FALSE, ...)

Arguments

textString
A character string of text or a vector of character strings.
stopwords
A character vector of words to remove from the text. qdap has a number of data sets that can be used as stopwords including: Top200Words, Top100Words, Top25Words. For the tm package's traditional English stop words use tm::stopwords("e
unlist
logical. If TRUE unlists into one vector. General use intended for when separate is FALSE.
separate
logical. If TRUE separates sentences into words. If FALSE retains sentences.
strip
logical. If TRUE strips the text of all punctuation except apostrophes.
unique
logical. If TRUE keeps only unique words (if unlist is TRUE) or sentences (if unlist is FALSE). General use intended for when unlist is TRUE.
char.keep
If strip is TRUE this argument provides a means of retaining supplied character(s).
names
logical. If TRUE will name the elements of the vector or list with the original textString.
ignore.case
logical. If TRUE stop words are removed regardless of case, and case is also stripped from the text. If FALSE stop word removal is case sensitive, and case is not stripped.
apostrophe.remove
logical. If TRUE removes apostrophes from the output.
...
Further arguments passed to the strip function.
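
To make the effect of ignore.case concrete, here is a minimal base-R sketch of stop word removal. It is not qdap's implementation: the helper `remove_stops`, its whitespace-only tokenizing, and the sample inputs are all assumptions made for illustration.

```r
# Hypothetical helper, NOT qdap's implementation: a base-R sketch of
# stop word removal on a character vector.
remove_stops <- function(text, stops, ignore.case = TRUE) {
  if (ignore.case) {
    # Lower-case both sides so matching ignores case; the output text
    # loses its case too, as the ignore.case argument describes.
    text  <- tolower(text)
    stops <- tolower(stops)
  }
  # Split each string on whitespace, then drop words found in `stops`.
  words <- strsplit(text, "\\s+")
  lapply(words, function(w) w[!w %in% stops])
}

stops <- c("the", "is", "a")  # made-up stop word list
insensitive <- remove_stops(c("The cat is here", "A dog"), stops)
sensitive   <- remove_stops(c("The cat is here", "A dog"), stops,
                            ignore.case = FALSE)
```

In the case-sensitive call, "The" and "A" survive because they do not match the lower-case entries in the stop word list exactly.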

Value

  • Returns a vector of sentences, vector of words, or (default) a list of vectors of words with stop words removed. Output depends on supplied arguments.
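
The three output shapes can be pictured with base-R analogues. This is only a sketch on made-up data: the `cleaned` list stands in for a result with stop words already removed, and the exact output of stopwords() for a given combination of arguments may differ.

```r
# Illustrative only: base-R analogues of the three output shapes.
# `cleaned` is made-up data standing in for text with stop words removed.
cleaned <- list(c("cat", "here"), c("dog", "barks", "here"))

# (default shape) a list of word vectors, one element per input string
length(cleaned)

# one flat vector of words (analogous to unlist = TRUE)
all_words <- unlist(cleaned)

# unique words only (analogous to unlist = TRUE, unique = TRUE)
unique_words <- unique(all_words)

# a vector of sentences (analogous to separate = FALSE)
sentences <- vapply(cleaned, paste, character(1), collapse = " ")
```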

See Also

strip, bag.o.words, stopwords

Examples

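library(qdap)  # provides stopwords(), the DATA set, and the TopNWords lists
# Default stop word list (Top25Words)
stopwords(DATA$state)
# Alternative stop word lists: tm's English list and qdap's Top200Words
stopwords(DATA$state, tm::stopwords("english"))
stopwords(DATA$state, Top200Words)
# Also strip punctuation (apostrophes are kept)
stopwords(DATA$state, Top200Words, strip = TRUE)
# Keep sentences intact rather than splitting them into words
stopwords(DATA$state, Top200Words, separate = FALSE)
stopwords(DATA$state, Top200Words, separate = FALSE, ignore.case = FALSE)
# Unlist the result into a single vector
stopwords(DATA$state, Top200Words, unlist = TRUE)
stopwords(DATA$state, Top200Words, unlist = TRUE, strip=TRUE)
stopwords(DATA$state, Top200Words, unlist = TRUE, unique = TRUE)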
