textmodel_ca: correspondence analysis of a document-feature matrix
Description
textmodel_ca implements correspondence analysis scaling on a dfm. The method is a fast, sparse version of the function ca from the ca package, and returns a special class of ca object.
Usage
textmodel_ca(x, smooth = 0, nd = NA, sparse = FALSE, threads = 1,
residual_floor = 0.1)
Arguments
x
the dfm on which the model will be fit
smooth
a smoothing parameter for word counts; defaults to zero.
nd
number of dimensions to be included in the output; if NA (the default), the maximum possible number of dimensions is included.
sparse
if TRUE, retains the sparsity of x; set it to TRUE if x (the dfm) is too large to be allocated in memory after conversion to a dense matrix
threads
the number of threads to be used; set to 1 to use a
serial version of the function; only applicable when sparse = TRUE
residual_floor
specifies the threshold for the residual matrix used when calculating the truncated SVD. A larger value will reduce memory and time cost but might sacrifice accuracy; only applicable when sparse = TRUE
Details
svds in the RSpectra package is applied to
enable the fast computation of the SVD.
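The role of the SVD can be illustrated with a minimal sketch of correspondence analysis on a small dense count matrix. This is not the package's implementation: base R svd() stands in for RSpectra's svds so that the example has no dependencies, the toy counts are made up, and all variable names are illustrative only.

```r
# Sketch only: correspondence analysis of a toy count matrix via SVD.
# Assumptions: base svd() replaces RSpectra::svds(); the data are invented.
counts <- matrix(
  c(10, 2, 0, 1,
     3, 8, 1, 0,
     0, 1, 9, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), paste0("feat", 1:4))
)

P  <- counts / sum(counts)   # correspondence matrix (proportions)
rm <- rowSums(P)             # row (document) masses
cm <- colSums(P)             # column (feature) masses
eP <- outer(rm, cm)          # expected proportions under independence
S  <- (P - eP) / sqrt(eP)    # standardized residuals

nd  <- min(dim(counts)) - 1  # maximum number of non-trivial dimensions
dec <- svd(S, nu = nd, nv = nd)

sv       <- dec$d[seq_len(nd)]   # singular values; sv^2 are principal inertias
rowcoord <- dec$u / sqrt(rm)     # standard coordinates of documents
colcoord <- dec$v / sqrt(cm)     # standard coordinates of features
```

In this sketch the squared singular values sum to the total inertia of the table, which is sum(S^2). A truncated solver such as svds computes only the leading nd singular triplets rather than the full decomposition, which is where the speed gain on a large dfm would come from.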
References
Nenadic, O. and Greenacre, M. (2007). Correspondence analysis in
R, with two- and three-dimensional graphics: The ca package. Journal
of Statistical Software, 20 (3), http://www.jstatsoft.org/v20/i03/.
Examples
# NOT RUN {
ieDfm <- dfm(data_corpus_irishbudget2010)
wca <- textmodel_ca(ieDfm)
summary(wca)
# }