qdap (version 1.3.5)

freq_terms: Find Frequent Terms

Description

Find the most frequently occurring terms in a text vector.

Usage

freq_terms(text.var, top = 20, at.least = 1, stopwords = NULL,
  extend = TRUE, ...)

Arguments

text.var
The text variable.
top
The number of top (most frequent) terms to show.
at.least
An integer giving the minimum number of letters a word must have to be included in the output.
stopwords
A character vector of words to remove from the text. qdap has a number of data sets that can be used as stop words including: Top200Words, Top100Words, Top25Words. For the tm package's traditional Engli
extend
Logical. If TRUE, the output is extended beyond top to include every word that has the same frequency as the word at the top cutoff.
...
Other arguments passed to all_words.
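
The following base R sketch illustrates how top, at.least, stopwords and extend interact. It is not qdap's implementation: sketch_freq_terms is a hypothetical helper, its tokenization (lower-casing, then splitting on anything that is not a letter or apostrophe) is a simplifying assumption, and the WORD and FREQ column names are chosen for illustration only.

```r
# Hypothetical stand-in for illustration only; not qdap's code.
sketch_freq_terms <- function(text, top = 20, at.least = 1,
                              stopwords = NULL, extend = TRUE) {
  empty <- data.frame(WORD = character(0), FREQ = integer(0),
                      stringsAsFactors = FALSE)

  # Assumed tokenization: lower-case, split on non-letter/apostrophe runs.
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[nzchar(words)]

  # at.least: keep words with at least this many characters.
  words <- words[nchar(words) >= at.least]

  # stopwords: drop listed words (compared in lower case).
  if (!is.null(stopwords)) {
    words <- words[!words %in% tolower(stopwords)]
  }
  if (length(words) == 0) return(empty)

  # Count, then sort by descending frequency (ties broken alphabetically).
  counts <- table(words)
  ord <- order(-as.integer(counts), names(counts))
  out <- data.frame(WORD = names(counts)[ord],
                    FREQ = as.integer(counts)[ord],
                    stringsAsFactors = FALSE)

  # top: cutoff position, capped at the number of distinct words.
  n <- min(top, nrow(out))
  if (extend) {
    # extend: also keep every word tied with the word at the cutoff.
    n <- sum(out$FREQ >= out$FREQ[n])
  }
  out[seq_len(n), , drop = FALSE]
}

text <- c("the cat sat on the mat", "the dog sat", "a cat ran")

strict   <- sketch_freq_terms(text, top = 2, extend = FALSE)
extended <- sketch_freq_terms(text, top = 2, extend = TRUE)
longer   <- sketch_freq_terms(text, top = Inf, at.least = 3)
no_the   <- sketch_freq_terms(text, top = 1, stopwords = "The")
```

In this sketch, strict holds exactly two rows, while extended gains a third row because "cat" and "sat" both occur twice and so tie at the cutoff. The same tie logic explains why a call such as freq_terms(DATA$state, 5) can return more than five rows when extend = TRUE.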

Value

  • Returns a data frame of the most frequently occurring words.

See Also

word_list, all_words

Examples

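library(qdap)

freq_terms(DATA$state, 5)               # top 5 terms
freq_terms(DATA$state)                  # default: top 20 terms
freq_terms(DATA$state, extend = FALSE)  # do not extend the output for tied frequencies
freq_terms(DATA$state, at.least = 4)    # only words with at least 4 letters
(out <- freq_terms(pres_debates2012$dialogue, stopwords = Top200Words))
plot(out)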

## All words by sentence (row)
x <- raj$dialogue
list_df2df(setNames(lapply(x, freq_terms, top=Inf), seq_along(x)), "row")
list_df2df(setNames(lapply(x, freq_terms, top=10, stopwords = Dolch),
    seq_along(x)), "Title")


## All words by person
FUN <- function(x, n=Inf) freq_terms(paste(x, collapse=" "), top=n)
list_df2df(lapply(split(x, raj$person), FUN), "person")

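## Plot it
library(ggplot2)  # provides ggtitle()
out <- lapply(split(x, raj$person), FUN, n=10)
pdf("Freq Terms by Person.pdf", width=13)
invisible(lapply(seq_along(out), function(i) {
    ## print() draws each plot explicitly, so this also works when sourced
    print(plot(out[[i]], plot=FALSE) + ggtitle(names(out)[i]))
}))
dev.off()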