quanteda (version 0.9.7-17)

tf: compute (weighted) term frequency from a dfm

Description

Apply one of several term frequency weighting schemes to a dfm.

Usage

tf(x, scheme = c("count", "prop", "propmax", "boolean", "log", "augmented", "logave"), base = 10, K = 0.5)

Arguments

x
object for which term frequency will be computed (a document-feature matrix)
scheme
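weighting scheme to apply; one of the values listed under Usage. Three of them divide the feature frequencies in each document by a divisor. They are described here as unity, total, and maxCount, which appear to correspond to "count", "prop", and "propmax":
unity
the default; feature counts are left unchanged, equivalent to dividing by 1

total
total count of all features in the document, so that the normalized feature values in each document sum to 1.0

maxCount
maximum feature count in the document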

base
base for the logarithm when scheme is "log" or "logave"
K
the K for the augmentation when scheme = "augmented"
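
To make the arguments above concrete, here is a minimal base-R sketch of the seven scheme names from Usage, applied to a plain numeric matrix rather than a dfm. It is not quanteda's implementation: the helper name tf_sketch and the example matrix are hypothetical, and the formulas for "log", "augmented", and "logave" are assumed to follow the term frequency variants in Manning et al. (2008), which this page does not spell out.

```r
# Hypothetical helper: a base-R sketch of the schemes, not quanteda's code.
# x is a numeric matrix with documents in rows and features in columns.
tf_sketch <- function(x,
                      scheme = c("count", "prop", "propmax", "boolean",
                                 "log", "augmented", "logave"),
                      base = 10, K = 0.5) {
  scheme <- match.arg(scheme)
  max_tf <- apply(x, 1, max)  # largest feature count in each document
  switch(scheme,
    count     = x,                  # counts unchanged (divide by 1)
    prop      = x / rowSums(x),     # each row sums to 1
    propmax   = x / max_tf,         # largest value in each row is 1
    boolean   = (x > 0) * 1,        # 1 if the feature occurs, else 0
    # Assumed formula: 1 + log(tf) for nonzero counts, 0 otherwise
    log       = ifelse(x > 0, 1 + log(x, base), 0),
    # Assumed formula: K + (1 - K) * tf / max(tf); zero counts become K here
    augmented = K + (1 - K) * x / max_tf,
    logave    = {
      # Assumed formula: (1 + log(tf)) / (1 + log(mean tf)), where the mean
      # is taken over the nonzero counts of each document
      mean_tf <- apply(x, 1, function(r) mean(r[r > 0]))
      ifelse(x > 0, (1 + log(x, base)) / (1 + log(mean_tf, base)), 0)
    }
  )
}

# Made-up counts for two documents and four features
x <- matrix(c(1, 2, 0, 1,
              3, 0, 0, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("doc1", "doc2"), c("a", "b", "c", "d")))

tf_sketch(x, "prop")
tf_sketch(x, "augmented", K = 0.5)
tf_sketch(x, "log", base = 10)
```

In this sketch, base only affects the "log" and "logave" branches and K only affects "augmented", which mirrors how the two arguments are described above.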

Value

A document-feature matrix to which the weighting scheme has been applied.

Details

tf is a shortcut to compute relative term frequencies (identical to weight(x, "relFreq")).
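
As an illustration of what relative term frequency means, the following base-R sketch divides each document's counts by that document's total. It uses an ordinary matrix with made-up counts rather than a dfm, so it only shows the arithmetic and does not call quanteda. Base R's prop.table() with margin = 1 performs the same row-wise division.

```r
# Made-up feature counts: documents in rows, features in columns
counts <- matrix(c(2, 1, 1,
                   0, 3, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"),
                                 c("apple", "banana", "cherry")))

# Relative frequency: each count divided by its document's total count
rel_freq <- counts / rowSums(counts)

# prop.table() with margin = 1 computes the same row proportions
same <- all.equal(rel_freq, prop.table(counts, margin = 1))

# After normalization, every document's values sum to 1
row_totals <- rowSums(rel_freq)
```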

References

Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.

https://en.wikipedia.org/wiki/Tf-idf#Term_frequency_2