docfreq

a <a rd-options="" href="/link/dfm?package=quanteda&version=0.99" data-mini-rdoc="quanteda::dfm">dfm</a>

type of document frequency weighting

scheme

added to the quotient before taking the logarithm

smoothing

added to the denominator in the "inverse" weighting types, to 
prevent a zero document count for a term

the base with respect to which logarithms in the inverse document
frequency weightings are computed; default is 10 (see Manning, 
Raghavan, and Schütze 2008, p. 123).

base
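
The three arguments above can be read together as one formula. The following is a base-R sketch of that reading, not quanteda's own code: it assumes the "quotient" is the number of documents divided by the document frequency, and the function name `inverse_docfreq` and the argument name `denom_offset` (standing in for the value added to the denominator) are hypothetical.

```r
# Sketch of an inverse document frequency weighting, under the assumption
# that the quotient is n_docs / (denom_offset + df).
# `inverse_docfreq` and `denom_offset` are illustrative names, not quanteda API.
inverse_docfreq <- function(df, n_docs, smoothing = 0, denom_offset = 0, base = 10) {
  # smoothing is added to the quotient before the logarithm is taken;
  # denom_offset is added to the denominator to avoid dividing by zero
  log(smoothing + n_docs / (denom_offset + df), base = base)
}

# Document frequencies of three features in a collection of 3 documents
df <- c(a = 2, b = 1, c = 2)
idf <- inverse_docfreq(df, n_docs = 3)
print(idf)  # a and c: log10(3/2), about 0.176; b: log10(3), about 0.477
```

With `df = 0` and the default `denom_offset = 0` the quotient is infinite, which is the case the denominator offset is meant to guard against.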

numeric value of the threshold above which a feature 
will be considered in the computation of document frequency. The default is 
0, meaning that a feature's document frequency will be the number of 
documents in which it occurs more than zero times.

threshold
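
To illustrate the strict "greater than" comparison described for the threshold, here is a base-R sketch on a made-up count matrix. The helper `doc_freq_count` is hypothetical and only mirrors the description above; it is not the package's implementation.

```r
# Toy document-feature counts; matrix() fills column by column,
# so each c(...) group below is one feature's counts in d1, d2, d3.
m <- matrix(c(2, 1, 0,   # feature "a"
              1, 0, 0,   # feature "b"
              0, 1, 3),  # feature "c"
            nrow = 3,
            dimnames = list(c("d1", "d2", "d3"), c("a", "b", "c")))

# Hypothetical helper: count the documents in which each feature
# occurs strictly more than `threshold` times.
doc_freq_count <- function(m, threshold = 0) {
  colSums(m > threshold)
}

df0 <- doc_freq_count(m)                 # a = 2, b = 1, c = 2
df1 <- doc_freq_count(m, threshold = 1)  # a = 1, b = 0, c = 1
print(df0)
print(df1)
```

Raising the threshold to 1 drops every document in which a feature occurs exactly once, so "b" falls to a document frequency of zero.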

logical; if <code>TRUE</code> attach feature labels as names of 
the resulting numeric vector

USE.NAMES

For a <a rd-options="" href="/link/dfm?package=quanteda&version=0.99" data-mini-rdoc="quanteda::dfm">dfm</a> object, returns a (weighted) document frequency for 
each term. The default is a simple count of the number of documents in which
a feature occurs more than a given frequency threshold. (The default 
threshold is zero, meaning that any feature occurring at least once in a 
document will be counted.)
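
As a minimal usage sketch of the default count, assuming the quanteda package is installed; the three documents are made up for illustration:

```r
library(quanteda)

# Three toy documents (illustrative data only)
txt <- c(d1 = "a a b", d2 = "a c", d3 = "c c c")

# Build a document-feature matrix from the tokenized texts
x <- dfm(tokens(txt))

# Default scheme: number of documents in which each feature occurs
# at least once
df <- docfreq(x)
print(df)  # a = 2, b = 1, c = 2
```

Feature "c" appears three times in d3 but still counts as one document, since the default counts documents, not occurrences.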

internal

weighting

A fast, flexible, and comprehensive framework for
quantitative text analysis in R.  Provides functionality for corpus management,
creating and manipulating tokens and ngrams, exploring keywords in context,
forming and manipulating sparse matrices
of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and
distances, applying content dictionaries, applying supervised and unsupervised machine learning,
visually representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Paul Nulty

Adam Obeng

Haiyan Wang

Benjamin Lauderdale

Will Lowe

docfreq function


docfreq: compute the (weighted) document frequency of a feature
