dfm_select

dfm_remove

dfm_keep

fcm_select

fcm_remove

fcm_keep

the <a rd-options="" href="/link/dfm?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::dfm">dfm</a> or <a rd-options="" href="/link/fcm?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::fcm">fcm</a> object whose features will be selected

a character vector, list of character vectors,
<a rd-options="" href="/link/dictionary?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::dictionary">dictionary</a>, or <a rd-options="" href="/link/collocations?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::collocations">collocations</a> object. See <a rd-options="" href="/link/pattern?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::pattern">pattern</a> for
details.

pattern

whether to <code>keep</code> or <code>remove</code> the features

selection

the type of pattern matching: <code>"glob"</code> for "glob"-style
wildcard expressions; <code>"regex"</code> for regular expressions; or <code>"fixed"</code> for
exact matching. See <a rd-options="" href="/link/valuetype?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::valuetype">valuetype</a> for details.

valuetype
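The three `valuetype` settings can be compared with a minimal sketch on a toy dfm. The texts and object names here are made up for illustration and are not taken from the package documentation. The dfm is built through `tokens()` on the assumption that this form works across quanteda releases.

```r
library(quanteda)

# Toy dfm built from invented texts (illustrative assumption).
toks <- tokens(c(d1 = "tax taxes taxation syntax", d2 = "tax cut"))
dfmat <- dfm(toks)

# "glob": "*" stands for any run of characters, so "tax*" should match
# features that begin with "tax" (tax, taxes, taxation) but not "syntax".
glob_sel <- dfm_select(dfmat, pattern = "tax*", valuetype = "glob")

# "regex": the pattern is a regular expression and is not anchored, so
# "tax" should also match inside "syntax". Use "^tax" to anchor at the start.
regex_sel <- dfm_select(dfmat, pattern = "tax", valuetype = "regex")

# "fixed": the whole feature name must equal the pattern.
fixed_sel <- dfm_select(dfmat, pattern = "tax", valuetype = "fixed")

featnames(glob_sel)
featnames(regex_sel)
featnames(fixed_sel)
```

The default `selection = "keep"` is used throughout, so each call returns only the matching features.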

logical; if <code>TRUE</code>, ignore case when matching a
<code>pattern</code> or <a rd-options="" href="/link/dictionary?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::dictionary">dictionary</a> values

case_insensitive

optional numerics specifying the minimum and
maximum length in characters for tokens to be removed or kept; defaults are
<code>NULL</code> for no limits. These are applied after (and hence, in addition
to) any selection based on pattern matches.

min_nchar, max_nchar
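As a hedged illustration of the length limits, the sketch below filters a toy dfm by feature length alone and then in combination with a glob pattern. The texts and object names are invented for this example. Only `selection = "keep"` (the default) is shown.

```r
library(quanteda)

# Toy dfm built from invented texts (illustrative assumption).
toks <- tokens(c(d1 = "a an the quick brown fox jumps",
                 d2 = "an extraordinarily quick fox"))
dfmat <- dfm(toks)

# No pattern: keep every feature whose name is 3 to 5 characters long.
len_only <- dfm_select(dfmat, min_nchar = 3, max_nchar = 5)

# Pattern plus length limits: a feature must match "*o*" AND be
# 3 to 5 characters long, so "extraordinarily" should be dropped.
len_and_pat <- dfm_select(dfmat, pattern = "*o*", valuetype = "glob",
                          min_nchar = 3, max_nchar = 5)

featnames(len_only)
featnames(len_and_pat)
```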

if <code>TRUE</code>, print a message about how many features were
removed

verbose

used only for passing arguments from <code>dfm_remove</code> or
<code>dfm_keep</code> to <code>dfm_select</code>. Cannot include
<code>selection</code>.
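The relationship between the wrappers and `dfm_select` can be sketched as follows. This assumes, as the argument description suggests, that `dfm_remove` and `dfm_keep` forward their arguments to `dfm_select` with `selection` fixed. The toy texts and object names are invented.

```r
library(quanteda)

# Toy dfm built from invented texts (illustrative assumption).
toks <- tokens(c(d1 = "the cat sat on the mat", d2 = "the dog sat"))
dfmat <- dfm(toks)

# dfm_remove() should behave like dfm_select(..., selection = "remove").
removed_a <- dfm_remove(dfmat, pattern = c("the", "on"))
removed_b <- dfm_select(dfmat, pattern = c("the", "on"), selection = "remove")

# dfm_keep() should behave like dfm_select(..., selection = "keep");
# other arguments such as valuetype are passed through.
kept_a <- dfm_keep(dfmat, pattern = "*at", valuetype = "glob")
kept_b <- dfm_select(dfmat, pattern = "*at", valuetype = "glob",
                     selection = "keep")

setequal(featnames(removed_a), featnames(removed_b))
setequal(featnames(kept_a), featnames(kept_b))
```

Because the wrappers set `selection` themselves, supplying it again in a call to `dfm_remove` or `dfm_keep` is not allowed.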

This function selects or removes features from a <a rd-options="" href="/link/dfm?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::dfm">dfm</a> or <a rd-options="" href="/link/fcm?package=quanteda&version=2.1.2" data-mini-rdoc="quanteda::fcm">fcm</a>,
based on feature name matches with <code>pattern</code>. The most common usages
are to eliminate features from a dfm already constructed, such as stopwords,
or to select only terms of interest from a dictionary.
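The two common usages described above might look like the following minimal sketch. The texts, the dictionary keys and values, and the object names are all invented for illustration. `stopwords("english")` is assumed to be available through quanteda.

```r
library(quanteda)

# Toy dfm built from invented texts (illustrative assumption).
toks <- tokens(c(d1 = "The budget will raise taxes and fund hospitals",
                 d2 = "Tax cuts and the health budget"))
dfmat <- dfm(toks)  # features are lowercased by default

# Usage 1: drop stopwords from an already constructed dfm.
no_stop <- dfm_remove(dfmat, pattern = stopwords("english"))

# Usage 2: keep only the features that match the values of a dictionary.
# The dictionary below is hypothetical.
dict <- dictionary(list(economy = c("tax*", "budget"),
                        health  = "hospital*"))
dict_only <- dfm_select(dfmat, pattern = dict, selection = "keep")

featnames(no_stop)
featnames(dict_only)
```

Note that selecting with a dictionary keeps the original feature names. It does not collapse them into dictionary keys.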

A fast, flexible, and comprehensive framework for
quantitative text analysis in R.  Provides functionality for corpus management,
creating and manipulating tokens and ngrams, exploring keywords in context,
forming and manipulating sparse matrices
of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and
distances, applying content dictionaries, applying supervised and unsupervised machine learning,
visually representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Haiyan Wang

Paul Nulty

Adam Obeng

Stefan Müller

Akitaka Matsuo

Jiong Wei Lua

Jouni Kuha

William Lowe

Christian Müller

Lori Young

Stuart Soroka

Ian Fellows

European Research Council 

dfm_select function

dfm_select: Select features from a dfm or fcm

Description

Usage

Arguments

Value

Details

See Also

Examples