
quanteda (version 0.99)

convert-wrappers: convenience wrappers for dfm convert

Description

To make usage as consistent as possible with other packages, quanteda also provides shortcut wrappers to convert. Their syntax is designed to resemble that of analogous commands in the packages whose formats they target.

Usage

as.wfm(x)

as.DocumentTermMatrix(x, ...)

dfm2ldaformat(x)

quantedaformat2dtm(x)

Arguments

x

the dfm to be converted

...

additional arguments used only by as.DocumentTermMatrix

Value

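A converted object whose format depends on the wrapper called; each wrapper returns the same object as convert with the corresponding value of to (for example, as.wfm(x) matches convert(x, to = "austin")). See the documentation of the conversion target package for more detailed descriptions of the return formats.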

Details

as.wfm converts a quanteda dfm into the wfm format used by the austin package.

as.DocumentTermMatrix converts a quanteda dfm into the tm package's DocumentTermMatrix format. Note: The tm package version of as.TermDocumentMatrix allows a weighting argument, which supplies a weighting function for TermDocumentMatrix. Here the default is term frequency weighting. If you want a different weighting, apply it after converting, using one of the tm weighting functions. For the other weighting functions available in the tm package, see TermDocumentMatrix.
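
As a minimal sketch of re-weighting after conversion, the following assumes that quanteda 0.99 and the tm package are both installed. It uses tm's weightTfIdf as one possible weighting function; the object names (mydfm, dtm, dtm_tfidf) are illustrative only.

```r
library(quanteda)

# Build a small dfm from the inaugural corpus shipped with quanteda
mydfm <- dfm(corpus_subset(data_corpus_inaugural, Year > 1970), verbose = FALSE)

# Convert to tm's DocumentTermMatrix (term frequency weighting by default)
dtm <- quanteda::as.DocumentTermMatrix(mydfm)

# Apply a different weighting after conversion (assumes tm is installed)
dtm_tfidf <- tm::weightTfIdf(dtm)
```

Calling the functions with explicit quanteda:: and tm:: prefixes avoids ambiguity, since both packages define a function named as.DocumentTermMatrix.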

dfm2ldaformat converts a dfm into the list representation of terms in documents used by the lda package (a list with components "documents" and "vocab", as needed by lda.collapsed.gibbs.sampler).
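
The following sketch shows how the two list components might be passed to lda.collapsed.gibbs.sampler. It assumes that quanteda 0.99 and the lda package are installed; the values chosen for K, num.iterations, alpha, and eta are arbitrary illustrations, not recommendations.

```r
library(quanteda)

# Build a small dfm from the inaugural corpus shipped with quanteda
mydfm <- dfm(corpus_subset(data_corpus_inaugural, Year > 1970), verbose = FALSE)

# Convert to the lda package's list format: "documents" and "vocab"
ldaobj <- dfm2ldaformat(mydfm)

# Fit a small topic model (assumes lda is installed);
# all tuning values below are illustrative assumptions
fit <- lda::lda.collapsed.gibbs.sampler(
    documents = ldaobj$documents,
    K = 5,
    vocab = ldaobj$vocab,
    num.iterations = 25,
    alpha = 0.1,
    eta = 0.1
)
```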

quantedaformat2dtm converts a dfm into the sparse simple triplet matrix representation of terms in documents used by the topicmodels package.
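
As a rough sketch of how the converted object could be used, the following passes it to topicmodels::LDA. It assumes that quanteda 0.99 and the topicmodels package are installed; the number of topics and the seed are arbitrary illustrative values.

```r
library(quanteda)

# Build a small dfm from the inaugural corpus shipped with quanteda
mydfm <- dfm(corpus_subset(data_corpus_inaugural, Year > 1970), verbose = FALSE)

# Convert to the document-term matrix format expected by topicmodels
dtm <- quantedaformat2dtm(mydfm)

# Fit a small LDA model (assumes topicmodels is installed);
# k = 5 and seed = 1 are illustrative assumptions
fit <- topicmodels::LDA(dtm, k = 5, control = list(seed = 1))
```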

Examples

mycorpus <- corpus_subset(data_corpus_inaugural, Year > 1970)
quantdfm <- dfm(mycorpus, verbose = FALSE)

# shortcut conversion to austin package's wfm format
identical(as.wfm(quantdfm), convert(quantdfm, to = "austin"))

# shortcut conversion to tm package's DocumentTermMatrix format
identical(as.DocumentTermMatrix(quantdfm), convert(quantdfm, to = "tm"))

# shortcut conversion to lda package list format
identical(dfm2ldaformat(quantdfm), convert(quantdfm, to = "lda"))

# shortcut conversion to topicmodels package format
identical(quantedaformat2dtm(quantdfm),
          convert(quantdfm, to = "topicmodels"))
