
quanteda (version 4.0.1)

tokens_recompile: recompile a serialized tokens object

Description

This function recompiles a serialized tokens object when the vocabulary has been changed in a way that makes some of its types identical (for example, lowercasing when a lowercased version of the type already exists in the types table) or that introduces gaps in the integer map of the types. It also re-indexes the types attribute to account for types that have become duplicates through a procedure such as stemming or lowercasing, or for new types added through compounding.
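
For illustration, here is a minimal sketch of the situation the function repairs (the input text and the direct edit of the types attribute are hypothetical, following the same pattern as the lowercasing example under Examples):

library("quanteda")

toks <- tokens("a b a c")
# overwriting a type with one that already exists leaves a duplicate entry
# in the types table and an unused ID in the integer map
attr(toks, "types")[2] <- "a"
# recompiling merges the duplicated type into a single ID and renumbers the
# map so that the IDs are contiguous again
unclass(quanteda:::tokens_recompile(toks))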

Usage

tokens_recompile(x, method = c("C++", "R"), gap = TRUE, dup = TRUE)

Arguments

x

the tokens object to be recompiled

method

"C++" for C++ implementation or "R" for an older R-based method

gap

if TRUE, remove gaps between token IDs

dup

if TRUE, merge duplicated token types into the same ID
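
The two steps can be switched off independently. A minimal sketch of toggling them (assumed usage of the internal function, with the flags behaving as documented above):

library("quanteda")

toks <- tokens("a b c d A B C D")
attr(toks, "types") <- char_tolower(attr(toks, "types"))

# merge duplicated types only, leaving any gaps in the integer map in place
quanteda:::tokens_recompile(toks, dup = TRUE, gap = FALSE)

# renumber the integer map only, without merging duplicated types
quanteda:::tokens_recompile(toks, gap = TRUE, dup = FALSE)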

Examples

# lowercasing
toks1 <- tokens(c(one = "a b c d A B C D",
                  two = "A B C d"))
attr(toks1, "types") <- char_tolower(attr(toks1, "types"))
unclass(toks1)
unclass(quanteda:::tokens_recompile(toks1))

# stemming
toks2 <- tokens("Stemming stemmed many word stems.")
unclass(toks2)
unclass(quanteda:::tokens_recompile(tokens_wordstem(toks2)))

# compounding
toks3 <- tokens("One two three four.")
unclass(toks3)
unclass(tokens_compound(toks3, phrase("two three")))

# lookup
dict <- dictionary(list(test = c("one", "three")))
unclass(tokens_lookup(toks3, dict))

# empty pads
unclass(tokens_select(toks3, dict))
unclass(tokens_select(toks3, dict, padding = TRUE))

# ngrams
unclass(tokens_ngrams(toks3, n = 2:3))
