dfm class, depending on whether the object can be represented by a sparse
matrix, in which case it is a dfmSparse class object, or if dense, then a
dfmDense object. See Details.
"t"(x)
"colSums"(x, na.rm = FALSE, dims = 1L, ...)
"rowSums"(x, na.rm = FALSE, dims = 1L, ...)
"colMeans"(x, na.rm = FALSE, dims = 1L, ...)
"rowMeans"(x, na.rm = FALSE, dims = 1L, ...)
"["(x, i = NULL, j = NULL, ..., drop = FALSE)
"["(x, i, j, ..., drop = FALSE)
"+"(e1, e2)
"as.matrix"(x)
na.rm
if TRUE, omit missing values (including NaN) from the calculations
FALSE
settings
settings.
weighting
"frequency", indicating that the values in the cells of the dfm are simple feature counts. To change this, use the weight method.
smooth
smooth or the weight methods.
Dimnames
docs and features respectively.

The dfm
class is a virtual class that will contain one of two
subclasses for containing the cell counts of document-feature matrices:
dfmSparse or dfmDense.

The dfmSparse
class is a sparse matrix version of dfm-class, inheriting dgCMatrix-class
from the Matrix package. It is the default object type created when feature
counts are the object of interest, as typical text-based feature counts
tend to contain many zeroes. As long as subsequent transformations of the dfm
preserve cells with zero counts, the dfm should remain sparse.

When the Matrix package implements sparse integer matrices, we will
switch the default object class to this object type, as integers are 4
bytes each (compared to the current numeric double type, which requires
8 bytes per cell).
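
The storage argument above can be checked with the Matrix package alone, without quanteda. The following is a minimal sketch under stated assumptions: the 1000 x 500 dimensions, the 1% density, and the object names `counts` and `dense` are made up for illustration and are not part of quanteda.

```r
library(Matrix)

set.seed(1)
n_docs  <- 1000   # hypothetical number of documents
n_feats <- 500    # hypothetical number of features

# Hypothetical counts: about 1% of the cells are nonzero, all stored as doubles
counts <- rsparsematrix(n_docs, n_feats, density = 0.01,
                        rand.x = function(n) rpois(n, 2) + 1)

# The same values in an ordinary dense base R matrix
dense <- as.matrix(counts)

class(counts)         # the sparse class that dfmSparse is described as inheriting
object.size(counts)   # grows with the number of nonzero cells
object.size(dense)    # roughly n_docs * n_feats * 8 bytes
```

With counts this sparse, the dense copy should be far larger than the sparse one, which is the reason the text gives for making the sparse class the default.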
The dfmDense class is a dense matrix version of dfm-class, inheriting
dgeMatrix-class from the Matrix package. dfm objects that are converted
through weighting or other transformations into cells without zeroes will
be automatically converted to the dfmDense class. This will necessarily be a
much larger object than one of dfmSparse class, because each cell is
recorded as a numeric (double) type requiring 8 bytes of storage.
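
The difference between transformations that keep zero cells and those that fill them can be sketched on a plain Matrix object. This is an illustration of the underlying Matrix behaviour only, assuming a hypothetical 3 x 4 count matrix; it does not call quanteda's weight or smooth methods, and the names `counts`, `scaled` and `shifted` are invented here.

```r
library(Matrix)

# Hypothetical 3 x 4 count matrix, with dimnames named as in a dfm
counts <- Matrix(c(2, 0, 0, 1,
                   0, 0, 3, 0,
                   0, 1, 0, 0),
                 nrow = 3, ncol = 4, byrow = TRUE, sparse = TRUE,
                 dimnames = list(docs = c("d1", "d2", "d3"),
                                 features = c("a", "b", "c", "d")))

scaled  <- counts * 2   # zero cells stay zero, so sparse storage still pays off
shifted <- counts + 1   # every cell becomes nonzero, so nothing is left to skip

class(counts)
class(scaled)
class(shifted)
```

In the Matrix versions this sketch assumes, multiplying by a constant returns a sparse result while adding a constant returns a dense one, which mirrors the switch from dfmSparse to dfmDense described above.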
# coercion to matrix
dfmSparse <- dfm(inaugTexts, verbose = FALSE)
str(as.matrix(dfmSparse))
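
# The methods listed under usage (t, colSums, rowSums, "[") can be tried on a
# small matrix from the Matrix package. This is a hedged sketch: `m` is a
# hypothetical dgCMatrix, not a dfm, and it is assumed (not verified here) that
# the dfm methods behave the same way.

```r
library(Matrix)

# Hypothetical 2 x 4 count matrix: documents in rows, features in columns
m <- Matrix(c(2, 0, 0, 1,
              0, 0, 3, 0),
            nrow = 2, ncol = 4, byrow = TRUE, sparse = TRUE,
            dimnames = list(docs = c("d1", "d2"),
                            features = c("a", "b", "c", "d")))

colSums(m)                   # total count per feature
rowSums(m)                   # total count per document
dim(t(m))                    # transposed: features by documents

# drop = FALSE keeps a single document as a one-row matrix
sub <- m[1, , drop = FALSE]
dim(sub)
```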