corpus_trimsentences

char_trimsentences

<a rd-options="" href="/link/corpus?package=quanteda&version=2.0.1" data-mini-rdoc="quanteda::corpus">corpus</a> or character object whose sentences will be selected.

minimum and maximum lengths in word tokens
(excluding punctuation)

min_length, max_length

a stringi regular expression whose match (at the
sentence level) will be used to exclude sentences

exclude_pattern

if <code>TRUE</code>, return a tokens object of the sentences that remain
after trimming; otherwise return an object of the input type with the
trimmed sentences removed.

return_tokens

Removes sentences from a corpus or a character vector when they are shorter
or longer than the specified lengths, or when they match a pattern.
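
The length and pattern filters described by the arguments above can be sketched in base R. This is only an illustration of the idea, not quanteda's implementation: the function name trim_sentences_sketch is hypothetical, the sentence splitter is deliberately naive, the default values are assumptions, and base R's perl-style regex stands in for the stringi engine.

```r
# Hypothetical helper, NOT part of quanteda: a base-R sketch of sentence
# trimming by token length and by pattern match.
trim_sentences_sketch <- function(x, min_length = 1, max_length = 10000,
                                  exclude_pattern = NULL) {
  vapply(x, function(doc) {
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sents <- unlist(strsplit(doc, "(?<=[.!?])\\s+", perl = TRUE))
    # Count word tokens per sentence, ignoring punctuation
    n_words <- vapply(sents, function(s) {
      words <- unlist(strsplit(gsub("[[:punct:]]", "", s), "\\s+"))
      sum(nzchar(words))
    }, integer(1))
    keep <- n_words >= min_length & n_words <= max_length
    # Drop sentences matching the exclusion pattern, if one is given
    if (!is.null(exclude_pattern)) {
      keep <- keep & !grepl(exclude_pattern, sents, perl = TRUE)
    }
    paste(sents[keep], collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

txt <- "PAGE 1. This is a single sentence.  Short sentence. Three word sentence."
# Keep only sentences with at least 3 word tokens
trim_sentences_sketch(txt, min_length = 3)
# Drop sentences that start with "PAGE <digits>"
trim_sentences_sketch(txt, exclude_pattern = "^PAGE \\d+")
```

The sketch returns one string per input document, which mirrors the character-vector case; the corpus case and the return_tokens option are not modelled here.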

internal

deprecated

A fast, flexible, and comprehensive framework for
quantitative text analysis in R.  Provides functionality for corpus management,
creating and manipulating tokens and ngrams, exploring keywords in context,
forming and manipulating sparse matrices
of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and
distances, applying content dictionaries, applying supervised and unsupervised machine learning,
visually representing text and text analyses, and more.

Kenneth Benoit

quanteda

Quantitative Analysis of Textual Data

Kohei Watanabe

Haiyan Wang

Paul Nulty

Adam Obeng

Stefan Müller

Akitaka Matsuo

Jiong Wei Lua

Jouni Kuha

William Lowe

Christian Müller

Lori Young

Stuart Soroka

Ian Fellows

European Research Council 

corpus_trimsentences function

corpus_trimsentences: Remove sentences based on their token lengths or a pattern match

Description

Usage

Arguments

Value

Examples