textmodel_ca: Correspondence analysis of a document-feature matrix
Description
textmodel_ca() implements correspondence analysis scaling on a dfm. The method is a fast, sparse version of the function ca.
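To make the scaling concrete, the sketch below works through the textbook correspondence analysis computation on a small dense matrix in base R: standardized residuals from the independence model, followed by a singular value decomposition. It is an illustration of the general technique only, not the package's internal code; the toy counts and all variable names (N, row_mass, col_mass, row_coord, and so on) are hypothetical.

```r
# Toy document-feature counts (hypothetical data): 4 documents x 5 features.
N <- matrix(c(10, 2, 0, 1, 3,
               8, 3, 1, 0, 2,
               1, 0, 9, 7, 2,
               0, 1, 8, 9, 3),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), paste0("feat", 1:5)))

P        <- N / sum(N)               # correspondence matrix (proportions)
row_mass <- rowSums(P)               # document masses
col_mass <- colSums(P)               # feature masses
E        <- outer(row_mass, col_mass) # expected proportions under independence
S        <- (P - E) / sqrt(E)        # standardized residuals

dec <- svd(S)                        # dense SVD of the residual matrix

# Standard coordinates on the first dimension
row_coord <- dec$u[, 1] / sqrt(row_mass)
col_coord <- dec$v[, 1] / sqrt(col_mass)

# Total inertia is the sum of the squared singular values
inertia <- sum(dec$d^2)

print(round(row_coord, 3))
print(round(inertia, 4))
```

In this toy matrix the first two documents share one vocabulary and the last two share another, so the first dimension places the two pairs on opposite sides of zero.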
Usage
textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, residual_floor = 0.1)
Value
textmodel_ca() returns a fitted CA textmodel that is a special class of ca object.
Arguments
- x
the dfm on which the model will be fit
- smooth
a smoothing parameter for word counts; defaults to zero.
- nd
number of dimensions to be included in the output; if NA (the default), then the maximum possible number of dimensions is included.
- sparse
retains the sparsity of x if set to TRUE; set it to TRUE if x (the dfm) is too large to be allocated after conversion to a dense matrix
- residual_floor
specifies the threshold for the residual matrix when calculating the truncated SVD. Larger values reduce memory and time costs but may reduce accuracy; only applicable when sparse = TRUE
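The trade-off described for residual_floor can be sketched in base R. The example below assumes that the floor works by setting standardized residuals with a small absolute value to zero, and that the threshold is scaled by the square root of the total count. Both points are assumptions made for illustration and are not confirmed by this page. The helper floor_residuals() and the simulated counts are hypothetical.

```r
set.seed(1)

# Hypothetical counts: 30 documents x 200 features, mostly zeros
N <- matrix(rpois(30 * 200, lambda = 0.3), nrow = 30)
N <- N[, colSums(N) > 0, drop = FALSE]   # drop features that never occur
n <- sum(N)

P <- N / n
E <- outer(rowSums(P), colSums(P))
S <- (P - E) / sqrt(E)                   # standardized residuals

# ASSUMPTION: residuals below the (scaled) floor in magnitude become zero,
# which makes the residual matrix sparser and cheaper to decompose.
floor_residuals <- function(S, n, residual_floor) {
  S[abs(S) < residual_floor / sqrt(n)] <- 0
  S
}

floors <- c(0, 0.1, 0.5, 1)
res <- data.frame(
  residual_floor = floors,
  # share of entries that remain non-zero after thresholding
  nonzero_share  = sapply(floors, function(f) mean(floor_residuals(S, n, f) != 0)),
  # leading singular value, as a rough indicator of how much the fit changes
  top_singular   = sapply(floors, function(f) svd(floor_residuals(S, n, f))$d[1])
)
print(res)
```

Under these assumptions, a higher floor leaves fewer non-zero residuals to store and decompose, at the cost of moving the singular values away from those of the unthresholded matrix.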
Author
Kenneth Benoit and Haiyan Wang
Details
svds in the RSpectra package is applied to
enable the fast computation of the SVD.
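As a rough illustration of why a truncated SVD helps, the sketch below requests only the leading k singular triplets with RSpectra::svds() and compares the singular values with those from a full base-R svd(). The random matrix and the variable names are hypothetical, and the fallback branch exists only so that the snippet still runs when RSpectra is not installed.

```r
set.seed(42)

# Hypothetical dense matrix standing in for a residual matrix
A <- matrix(rnorm(60 * 40), nrow = 60)
k <- 3                                  # number of dimensions to keep

full <- svd(A)                          # computes all 40 singular triplets

if (requireNamespace("RSpectra", quietly = TRUE)) {
  # Truncated SVD: only the k largest singular values and vectors
  top <- RSpectra::svds(A, k = k)
} else {
  # Fallback when RSpectra is unavailable: truncate the full decomposition
  top <- list(d = full$d[1:k],
              u = full$u[, 1:k, drop = FALSE],
              v = full$v[, 1:k, drop = FALSE])
}

# The leading singular values agree up to numerical tolerance
max_diff <- max(abs(top$d - full$d[1:k]))
print(max_diff)
```

The singular vectors from the two routes may differ in sign, so only the singular values are compared here.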
References
Nenadic, O. & Greenacre, M. (2007). Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca package. Journal of Statistical Software, 20(3). doi:10.18637/jss.v020.i03
Examples
library("quanteda")
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
tmod <- textmodel_ca(dfmat)
summary(tmod)