read.corp.custom

read.corp.custom,kRp.text-method

An object of class <code>kRp.text</code>; the <code>"token"</code> column of its <code>tokens</code> slot is used.

corpus

Logical. If <code>FALSE</code>, all tokens will be matched in their lower case form.

caseSens

A numeric value defining the base of the logarithm used for inverse document frequency (idf). See
<code><a rd-options="base:log" href="/link/log?package=koRpus&version=0.13-4&to=base%3Alog" data-mini-rdoc="base:log::log">log</a></code> for details.

log.base

Additional options for methods of the generic.

A document term matrix of the <code>corpus</code> object as generated by <code><a rd-options="koRpus:docTermMatrix" href="/link/docTermMatrix?package=koRpus&version=0.13-4&to=koRpus%3AdocTermMatrix" data-mini-rdoc="koRpus:docTermMatrix::docTermMatrix">docTermMatrix</a></code>.
This argument merely exists for cases where you want to re-use an already existing matrix.
By default, it is created from the <code>corpus</code> object.

Logical, whether the output should be just the analysis results or the input object with
the results added as a feature. Use <code><a rd-options="koRpus:corpusCorpFreq" href="/link/corpusCorpFreq?package=koRpus&version=0.13-4&to=koRpus%3AcorpusCorpFreq" data-mini-rdoc="koRpus:corpusCorpFreq::corpusCorpFreq">corpusCorpFreq</a></code>
to get the results from such an aggregated object.

as.feature

Read data from a custom corpus into a valid object of class <code><a rd-options="koRpus:kRp.corp.freq-class" href="/link/kRp.corp.freq?package=koRpus&version=0.13-4&to=koRpus%3AkRp.corp.freq-class" data-mini-rdoc="koRpus:kRp.corp.freq-class::kRp.corp.freq">kRp.corp.freq</a></code>.

corpora

A set of tools to analyze texts. Includes, amongst others, functions for
automatic language detection, hyphenation, several indices of lexical diversity
(e.g., type token ratio, HD-D/vocd-D, MTLD) and readability (e.g., Flesch,
SMOG, LIX, Dale-Chall). Basic import functions for language corpora are also
provided, to enable frequency analyses (supports Celex and Leipzig Corpora
Collection file formats) and measures like tf-idf. Note: For full functionality
a local installation of TreeTagger is recommended. It is also recommended to
not load this package directly, but by loading one of the available language
support packages from the 'l10n' repository
<https://undocumeantit.github.io/repos/l10n/>. 'koRpus' also includes a plugin
for the R GUI and IDE RKWard, providing graphical dialogs for its basic
features. The respective R package 'rkward' cannot be installed directly from a
repository, as it is a part of RKWard. To make full use of this feature, please
install RKWard from <https://rkward.kde.org> (plugins are detected
automatically). Due to some restrictions on CRAN, the full package sources are
only available from the project homepage. To ask for help, report bugs, request
features, or discuss the development of the package, please subscribe to the
koRpus-dev mailing list (<https://korpusml.reaktanz.de>).

Meik Michalke

koRpus

Text Analysis with Emphasis on POS Tagging, Readability and Lexical Diversity

Earl Brown

Alberto Mirisola

Alexandre Brulet

Laura Hauser


read.corp.custom: Import custom corpus data

Description

Usage

Arguments

Value

Details

See Also

Examples
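
The following is a minimal base R sketch of how the <code>caseSens</code> and <code>log.base</code> arguments could affect an inverse document frequency. It does not call koRpus: <code>toy_idf()</code> and the <code>docs</code> list are hypothetical, and the formula log(N / n_t) is the common textbook definition, assumed here for illustration rather than taken from the package sources.

```r
# Toy corpus: three already tokenized "documents" (hypothetical data).
docs <- list(
  d1 = c("the", "cat", "sat"),
  d2 = c("the", "dog", "sat"),
  d3 = c("the", "Cat", "ran")
)

# Sketch of an inverse document frequency: log(N / n_t) in the given base,
# where N is the number of documents and n_t the number of documents
# containing term t. This definition is an assumption for illustration,
# not necessarily the exact formula used by koRpus.
toy_idf <- function(docs, caseSens = TRUE, log.base = 10) {
  if (!caseSens) {
    # match all tokens in their lower case form
    docs <- lapply(docs, tolower)
  }
  terms <- sort(unique(unlist(docs)))
  # count the documents in which each term occurs at least once
  in_docs <- vapply(
    terms,
    function(term) sum(vapply(docs, function(d) term %in% d, logical(1))),
    integer(1)
  )
  log(length(docs) / in_docs, base = log.base)
}

# case sensitive, base 10: "cat" and "Cat" are counted as different terms
idf_cs <- toy_idf(docs, caseSens = TRUE)

# case insensitive, base 2: "cat" and "Cat" are merged before counting
idf_ci <- toy_idf(docs, caseSens = FALSE, log.base = 2)
```

In this sketch, a term that occurs in every document (here <code>"the"</code>) gets an idf of zero regardless of the base, while changing <code>log.base</code> only rescales the remaining values.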