quanteda (version 4.2.0)

dfm-class: Virtual class "dfm" for a document-feature matrix

Description

An object of class dfm is a type of Matrix-class object with additional slots, described below. quanteda uses two subclasses of the dfm class: an object that can be represented by a sparse matrix is a dfm class object, while a dense one is a dfmDense object. See Details.

Usage

# S4 method for dfm
t(x)

# S4 method for dfm
colSums(x, na.rm = FALSE, dims = 1, ...)

# S4 method for dfm
rowSums(x, na.rm = FALSE, dims = 1, ...)

# S4 method for dfm
colMeans(x, na.rm = FALSE, dims = 1, ...)

# S4 method for dfm
rowMeans(x, na.rm = FALSE, dims = 1, ...)

# S4 method for dfm,numeric
Arith(e1, e2)

# S4 method for numeric,dfm
Arith(e1, e2)

# S4 method for dfm,index,index,missing
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,index,index,logical
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,missing,missing,missing
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,missing,missing,logical
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,index,missing,missing
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,index,missing,logical
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,missing,index,missing
[(x, i, j, ..., drop = TRUE)

# S4 method for dfm,missing,index,logical
[(x, i, j, ..., drop = TRUE)
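
As a minimal sketch of how the summary and arithmetic methods above might be used, the following builds a tiny dfm from made-up texts (the document names d1 and d2 and their contents are assumptions for illustration only) and applies t(), colSums(), rowSums() and multiplication by a number.

```r
library("quanteda")

# Illustrative two-document dfm; the texts are invented for this sketch
dfmat <- dfm(tokens(c(d1 = "a b b c", d2 = "b c c c")))

feat_totals <- colSums(dfmat)  # total count of each feature across documents
doc_totals  <- rowSums(dfmat)  # total count of all features in each document
dfmat_t     <- t(dfmat)        # transpose: features become rows

# Arith with a number on either side works cell by cell
doubled_right <- dfmat * 2
doubled_left  <- 2 * dfmat

feat_totals
doc_totals
dim(dfmat_t)
```

Here feature b occurs twice in d1 and once in d2, so its column sum is 3, and each document contains four tokens in total.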

Arguments

x

the dfm object

na.rm

if TRUE, omit missing values (including NaN) from the calculations

dims

ignored

...

additional arguments not used here

e1

first quantity in an Arith operation for dfm

e2

second quantity in an Arith operation for dfm

i

document names or indices for documents to extract.

j

feature names or indices for features to extract.
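
The i and j arguments accept names as well as numeric or logical indices. The sketch below illustrates this on a small dfm; the document names (d1, d2, d3) and texts are hypothetical and chosen only for the example.

```r
library("quanteda")

# Hypothetical three-document dfm with explicit document names
dfmat <- dfm(tokens(c(d1 = "a b b c", d2 = "b c c c", d3 = "a a c")))

by_docname  <- dfmat[c("d1", "d3"), ]       # i given as document names
by_featname <- dfmat[, c("b", "c")]         # j given as feature names
by_logical  <- dfmat[c(TRUE, FALSE, TRUE), ] # i given as a logical index
by_position <- dfmat[1:2, 2:3]              # numeric indices for both

docnames(by_docname)
featnames(by_featname)
```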

Slots

weightTf

the type of term frequency weighting applied to the dfm. Default is "frequency", indicating that the values in the cells of the dfm are simple feature counts. To change this, use the dfm_weight() method.

weightDf

the type of document frequency weighting applied to the dfm. See docfreq().

smooth

a smoothing parameter, defaults to zero. Can be changed using the dfm_smooth() method.

Dimnames

These are inherited from Matrix-class but are named docs and features respectively.
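
The weighting and smoothing described by these slots are changed through the functions named above rather than by editing the object by hand. The following is a minimal sketch using dfm_weight(), dfm_smooth() and docfreq() on invented texts; it deliberately avoids reading the slots directly, on the assumption that their internal layout may differ between package versions.

```r
library("quanteda")

# Illustrative two-document dfm; the texts are invented for this sketch
dfmat <- dfm(tokens(c(d1 = "a b b c", d2 = "b c c c")))

# Term frequency weighting: convert counts to within-document proportions
dfmat_prop <- dfm_weight(dfmat, scheme = "prop")

# Smoothing: add a constant (here 1) to every cell
dfmat_smoothed <- dfm_smooth(dfmat, smoothing = 1)

# Document frequency: number of documents in which each feature occurs
doc_freq <- docfreq(dfmat)

rowSums(dfmat_prop)  # proportions sum to 1 within each document
doc_freq
```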

Details

The dfm class is a virtual class that will contain dgCMatrix-class.
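
One way to see this relationship in practice is to test an object's class membership. This is a small sketch on made-up texts, not a statement about how the package checks classes internally.

```r
library("quanteda")

# Illustrative two-document dfm; the texts are invented for this sketch
dfmat <- dfm(tokens(c(d1 = "a b b c", d2 = "b c c c")))

is.dfm(dfmat)                    # quanteda's own check for dfm objects
methods::is(dfmat, "dgCMatrix")  # does the object inherit from dgCMatrix?
methods::is(dfmat, "Matrix")     # does it inherit from the Matrix virtual class?
sparsity(dfmat)                  # proportion of cells that are zero
```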

See Also

dfm

Examples

library("quanteda")

# dfm subsetting
dfmat <- dfm(tokens(c("this contains lots of stopwords",
                      "no if, and, or but about it: lots",
                      "and a third document is it"),
                    remove_punct = TRUE))
dfmat[1:2, ]     # first two documents, all features
dfmat[1:2, 1:5]  # first two documents, first five features