findMostFreqTerms: Find Most Frequent Terms

Description

Find most frequent terms in a document-term or term-document matrix, or a vector of term frequencies.

Usage

findMostFreqTerms(x, n = 6L, ...)
# S3 method for DocumentTermMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)
# S3 method for TermDocumentMatrix
findMostFreqTerms(x, n = 6L, INDEX = NULL, ...)

Value

For the document-term or term-document matrix methods, a list with one element per document (or per group of documents, if INDEX is given), each a named vector giving the frequencies of the up to n most frequent terms. Otherwise, a single such named vector of most frequent terms.
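
As a rough illustration of the two return shapes described above, the following sketch builds a small toy corpus (the two strings are made up for illustration and are not part of the package) and inspects what each method returns. It assumes the tm package is installed and that the default settings of DocumentTermMatrix() and termFreq() are used.

```r
library(tm)

# Hypothetical toy corpus; the strings are illustrative only.
corpus <- VCorpus(VectorSource(c("crude oil prices rise",
                                 "oil output falls as oil demand weakens")))
dtm <- DocumentTermMatrix(corpus)

# Matrix method: a list with one named vector per document.
top <- findMostFreqTerms(dtm, n = 3L)
str(top)

# Term-frequency method: a single named vector.
tf <- termFreq(corpus[[2L]])
top_tf <- findMostFreqTerms(tf, n = 3L)
names(top_tf)  # the terms
top_tf         # their frequencies
```

The names of each vector are the terms and the values are their frequencies, so names(top[[1L]]) gives the terms selected for the first document.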

Arguments

x: A DocumentTermMatrix or TermDocumentMatrix, or a vector of term frequencies as obtained by termFreq().
n: A single integer giving the maximal number of terms.
INDEX: An object specifying a grouping of documents for rollup, or NULL (default) in which case each document is considered individually.
...: Arguments to be passed to or from methods.

Details

Only terms with positive frequencies are included in the results.
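
One consequence is that a result can hold fewer than n terms. The sketch below, again using a made-up toy corpus and assuming the tm package with default settings, requests up to 10 terms per document although no document contains that many.

```r
library(tm)

# Hypothetical toy corpus; the strings are illustrative only.
corpus <- VCorpus(VectorSource(c("oil oil price", "market")))
dtm <- DocumentTermMatrix(corpus)

# Size of the vocabulary across all documents.
nTerms(dtm)

# Ask for up to 10 terms; each element holds only the terms that
# actually occur in that document.
top <- findMostFreqTerms(dtm, n = 10L)
lengths(top)
```

Terms that belong to the vocabulary but do not occur in a given document are absent from that document's vector rather than being listed with a frequency of zero.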

Examples

library("tm")
data("crude")

## Term frequencies:
tf <- termFreq(crude[[14L]])
findMostFreqTerms(tf)

## Document-term matrices:
dtm <- DocumentTermMatrix(crude)
## Most frequent terms for each document:
findMostFreqTerms(dtm)
## Most frequent terms for the first 10 and the second 10 documents,
## respectively:
findMostFreqTerms(dtm, INDEX = rep(1 : 2, each = 10L))
