cast_tdm

cast_dtm

cast_dfm

These functions turn a "tidy" one-term-per-document-per-row data frame into
a DocumentTermMatrix or TermDocumentMatrix from the tm package, or a
dfm from the quanteda package. They support non-standard evaluation
through the tidyeval framework. Any grouping on the data frame is ignored.
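A minimal sketch of the three casts on a toy data frame may help. The data and the column names (`doc`, `word`, `n`) are invented for illustration, and the example assumes the tm and quanteda packages are installed. Arguments are named explicitly so the call does not depend on positional order.

```r
library(tidytext)

# Hypothetical tidy input: one row per document-term pair, with a count column
word_counts <- data.frame(
  doc  = c("a", "a", "b"),
  word = c("cat", "dog", "cat"),
  n    = c(2, 1, 3),
  stringsAsFactors = FALSE
)

# Terms as rows, documents as columns (tm)
tdm <- cast_tdm(word_counts, term = word, document = doc, value = n)

# Documents as rows, terms as columns (tm)
dtm <- cast_dtm(word_counts, document = doc, term = word, value = n)

# Document-feature matrix (quanteda)
dfm <- cast_dfm(word_counts, document = doc, term = word, value = n)

# Columns may also be given as strings instead of bare symbols
dtm_from_strings <- cast_dtm(word_counts, document = "doc", term = "word", value = "n")
```

Printing `tdm`, `dtm`, or `dfm` should show a sparse 2-by-2 matrix in each case, since the toy data has two documents and two distinct terms.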

Using tidy data principles can make many text mining tasks
easier, more effective, and consistent with tools already in wide use.
Much of the infrastructure needed for text mining with tidy data
frames already exists in packages like 'dplyr', 'broom', 'tidyr', and
'ggplot2'. In this package, we provide functions and supporting data
sets to allow conversion of text to and from tidy formats, and to
switch seamlessly between tidy tools and existing text mining
packages.

Julia Silge

tidytext

Text Mining using 'dplyr', 'ggplot2', and Other Tidy Tools

Gabriela De Queiroz

Colin Fay

Emil Hvitfeldt

Os Keyes

Kanishka Misra

Tim Mastny

Jeff Erickson

David Robinson

cast_tdm function

<dl><dt>data</dt>
<dd>Table with one-term-per-document-per-row</dd>
<dt>term</dt>
<dd>Column containing terms as string or symbol</dd>
<dt>document</dt>
<dd>Column containing document IDs as string or symbol</dd>
<dt>value</dt>
<dd>Column containing values as string or symbol</dd>
<dt>weighting</dt>
<dd>The weighting function for the DTM/TDM
(default is term-frequency, effectively unweighted)</dd>
<dt>...</dt>
<dd>Extra arguments passed on to
<code>sparseMatrix()</code></dd></dl>
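As a rough illustration of the `weighting` argument, the sketch below casts the same toy data twice: once with the default term-frequency weighting and once with `tm::weightTfIdf`. The data frame and its column names are hypothetical, and the tm package is assumed to be installed.

```r
library(tidytext)

# Hypothetical tidy input: one row per document-term pair, with a count column
word_counts <- data.frame(
  doc  = c("a", "a", "b"),
  word = c("cat", "dog", "cat"),
  n    = c(2, 1, 3),
  stringsAsFactors = FALSE
)

# Default weighting: raw term frequency, so the counts are carried over unchanged
dtm_tf <- cast_dtm(word_counts, document = doc, term = word, value = n)

# Alternative weighting: tf-idf, using the weighting function exported by tm
dtm_tfidf <- cast_dtm(
  word_counts,
  document = doc, term = word, value = n,
  weighting = tm::weightTfIdf
)
```

Any function that tm accepts as a weighting function should work here in the same way; `tm::weightTfIdf` is used only because it is a common choice.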

Arguments

Casting a data frame to
a DocumentTermMatrix, TermDocumentMatrix, or dfm


