remove_infrequent_terms

dfm_object

A quanteda dfm object.

proportion_threshold

Proportion of documents a term must appear in
to be kept in the dfm.

indices

Defaults to NULL. If not NULL, it must be a numeric
vector giving the column indices of the terms to remove.
Useful for removing specific terms.

verbose

Logical indicating whether progress information about
preprocessing should be printed to the screen. Defaults
to TRUE.

Removes from a dfm all terms that appear in fewer than a specified
proportion of the documents in a corpus.

Functions to assess the effects of different text preprocessing decisions on the inferences drawn from the resulting document-term matrices they generate.

remove_infrequent_terms: Remove infrequently occurring terms from a quanteda dfm.

Description

Usage

Arguments

Value

Examples
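
The filtering this page describes can be illustrated with a small base-R sketch. It works on a plain count matrix rather than a quanteda dfm, and it is not the package's own code: the helper name drop_rare_columns and the toy data are hypothetical, and letting indices take precedence over proportion_threshold is an assumption made for illustration.

```r
# Hypothetical helper: a base-R sketch of the filtering, not the package's code.
# `counts` is a documents-by-terms count matrix standing in for a dfm.
drop_rare_columns <- function(counts, proportion_threshold, indices = NULL) {
  if (length(indices) > 0) {
    # Assumption: explicit column indices take precedence over the threshold.
    return(counts[, -indices, drop = FALSE])
  }
  # Share of documents in which each term appears at least once.
  doc_proportion <- colSums(counts > 0) / nrow(counts)
  # Keep only the terms that reach the threshold.
  counts[, doc_proportion >= proportion_threshold, drop = FALSE]
}

# Toy data: 4 documents, 4 terms.
counts <- matrix(
  c(1, 0, 2, 0,
    3, 0, 0, 0,
    1, 1, 0, 0,
    2, 0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("tax", "levy", "budget", "zzz"))
)

# "levy" appears in 1 of 4 documents and "zzz" in none, so both are dropped.
kept <- drop_rare_columns(counts, proportion_threshold = 0.5)
colnames(kept)
# "tax" "budget"

# Removing specific terms by column index instead.
by_index <- drop_rare_columns(counts, proportion_threshold = 0.5, indices = c(2, 4))
colnames(by_index)
# "tax" "budget"
```

With a real dfm, the same choice would be expressed through the proportion_threshold and indices arguments listed above.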