tokens_recompile

the tokens object to be recompiled

<code>"C++"</code> for the C++ implementation or <code>"R"</code> for an older
R-based method

method

if <code>TRUE</code>, remove gaps between token IDs

if <code>TRUE</code>, merge duplicated token types into the same ID

This function recompiles a serialized tokens object when the vocabulary has
changed in a way that makes some of its types identical (for example,
lowercasing when a lowercased version of a type already exists in the type
table) or that introduces gaps in the integer map of the types. It also
re-indexes the types attribute to account for types that may have become
duplicates through a procedure such as stemming or lowercasing, or for new
tokens added through compounding.
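
The two operations described here, merging duplicated types and closing gaps in the integer map, can be illustrated with a small language-neutral sketch. It is written in Python for illustration only and is not quanteda's implementation. The function name <code>recompile</code>, its parameter names, and the plain-list representation of a tokens object (documents as lists of 1-based integer IDs plus a type table) are all assumptions made for this example.

```python
def recompile(docs, types, gap=True, dup=True):
    """Illustrative sketch only (hypothetical helper, not quanteda code).

    docs  : list of documents, each a list of 1-based integer token IDs
    types : type table; types[i - 1] is the string for ID i
    gap   : if True, remove gaps between token IDs
    dup   : if True, merge duplicated types into the same ID
    """
    # Current ID assigned to each slot of the type table.
    ids = list(range(1, len(types) + 1))

    if dup:
        # Every copy of a type takes the ID of its first occurrence.
        first = {}
        ids = [first.setdefault(t, i) for i, t in zip(ids, types)]

    # Re-map the token IDs in every document.
    docs = [[ids[i - 1] for i in doc] for doc in docs]

    if gap:
        # Keep only IDs that are still in use and renumber them 1..n.
        used = sorted({i for doc in docs for i in doc})
        new_id = {old: new for new, old in enumerate(used, start=1)}
        docs = [[new_id[i] for i in doc] for doc in docs]
        types = [types[old - 1] for old in used]

    return docs, types


# Type table after lowercasing "a b c d A B C D": every type now appears twice.
docs, types = recompile(
    [[1, 2, 3, 4, 5, 6, 7, 8], [5, 6, 7, 4]],
    ["a", "b", "c", "d", "a", "b", "c", "d"],
)
print(docs)   # [[1, 2, 3, 4, 1, 2, 3, 4], [1, 2, 3, 4]]
print(types)  # ['a', 'b', 'c', 'd']
```

In this sketch the duplicate merge runs before the gap removal, because merging duplicates is itself what leaves some IDs unused.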

internal

tokens

A fast, flexible, and comprehensive framework for
quantitative text analysis in R. Provides functionality for corpus management,
creating and manipulating tokens and ngrams, exploring keywords in context,
forming and manipulating sparse matrices
of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and
distances, applying content dictionaries, applying supervised and unsupervised machine learning,
visually representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Haiyan Wang

Paul Nulty

Adam Obeng

Stefan Müller

Akitaka Matsuo

Jiong Wei Lua

Jouni Kuha

William Lowe

Christian Müller

Lori Young

Stuart Soroka

Ian Fellows

European Research Council 

tokens_recompile: recompile a serialized tokens object
