textmodel_ca: Correspondence analysis of a document-feature matrix

Description

textmodel_ca() fits a correspondence analysis (CA) scaling model to a dfm. It is a fast, sparse-aware counterpart to the ca() function from the ca package.

Usage

textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, residual_floor = 0.1)

Value

textmodel_ca() returns a fitted CA textmodel that is a special class of ca object.

Arguments

x: the dfm on which the model will be fit
smooth: a smoothing parameter for word counts; defaults to zero.
nd: the number of dimensions to include in the output; if NA (the default), the maximum possible number of dimensions is included.
sparse: if TRUE, retains the sparsity of x; set this to TRUE when the dfm is too large to be allocated as a dense matrix.
residual_floor: the threshold applied to the residual matrix when computing the truncated SVD. Larger values reduce memory use and computation time but may reduce accuracy; only applies when sparse = TRUE.

Author

Kenneth Benoit and Haiyan Wang

Details

The singular value decomposition (SVD) is computed with svds() from the RSpectra package, which keeps the computation fast.
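
To make the role of the SVD concrete, here is a minimal base-R sketch of the textbook correspondence analysis computation: an SVD of the standardized residuals of the count matrix. It is an illustration only, not the package's implementation. It uses base svd() on a small dense matrix rather than RSpectra's svds(), and the toy counts and all variable names (N, P, rmass, cmass, S) are assumptions made for the example.

```r
# Toy document-feature counts (hypothetical data): 4 documents x 5 features
N <- matrix(c(10, 2, 0, 3, 1,
               1, 8, 4, 0, 2,
               0, 3, 9, 1, 5,
               4, 0, 2, 7, 6),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), paste0("feat", 1:5)))

P     <- N / sum(N)   # correspondence matrix (relative frequencies)
rmass <- rowSums(P)   # row (document) masses
cmass <- colSums(P)   # column (feature) masses

# Standardized residuals: D_r^{-1/2} (P - r c') D_c^{-1/2}
expected <- outer(rmass, cmass)
S <- (P - expected) / sqrt(expected)

# Dense SVD; a truncated solver would compute only the leading dimensions
dec <- svd(S)
nd  <- min(dim(N)) - 1          # maximum number of non-trivial dimensions
sv  <- dec$d[seq_len(nd)]       # singular values

# Standard coordinates of documents and features
rowcoord <- dec$u[, seq_len(nd), drop = FALSE] / sqrt(rmass)
colcoord <- dec$v[, seq_len(nd), drop = FALSE] / sqrt(cmass)

# Total inertia is the sum of squared singular values
total_inertia <- sum(sv^2)
```

In this formulation the total inertia equals the Pearson chi-squared statistic of the table divided by the total count, which gives a quick sanity check on the decomposition.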

References

Nenadic, O. & Greenacre, M. (2007). Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca package. Journal of Statistical Software, 20(3). doi:10.18637/jss.v020.i03

Examples

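library("quanteda")
# in quanteda >= 2.0, textmodel_ca() and data_corpus_irishbudget2010 live in quanteda.textmodels
library("quanteda.textmodels")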
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
tmod <- textmodel_ca(dfmat)
summary(tmod)
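
The remaining arguments can be exercised in the same way. The following sketch, which assumes that the quanteda and quanteda.textmodels packages are installed, requests only two dimensions and keeps the computation sparse. The value residual_floor = 0.1 is simply the default from the usage line above, not a tuned recommendation, and the object name tmod_sparse is arbitrary.

```r
library("quanteda")
library("quanteda.textmodels")  # assumed to provide textmodel_ca() and the example corpus

dfmat <- dfm(tokens(data_corpus_irishbudget2010))

# Keep the dfm sparse and retain only the first two dimensions
tmod_sparse <- textmodel_ca(dfmat, nd = 2, sparse = TRUE, residual_floor = 0.1)
summary(tmod_sparse)
```

For a dfm this small the dense default is sufficient; sparse = TRUE is intended for inputs too large to convert to a dense matrix.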
