kRp.tagged,-class

kRp_tagged

kRp.tagged-class

This class is used for objects that are returned by <code><a rd-options="koRpus:treetag" href="/link/treetag?package=koRpus&version=0.11-5&to=koRpus%3Atreetag" data-mini-rdoc="koRpus:treetag::treetag">treetag</a></code> or <code><a rd-options="koRpus:tokenize" href="/link/tokenize?package=koRpus&version=0.11-5&to=koRpus%3Atokenize" data-mini-rdoc="koRpus:tokenize::tokenize">tokenize</a></code>.

classes

A set of tools to analyze texts. Includes, amongst others,
functions for automatic language detection, hyphenation,
several indices of lexical diversity (e.g., type token ratio,
HD-D/vocd-D, MTLD) and readability (e.g., Flesch, SMOG, LIX,
Dale-Chall). Basic import functions for language corpora are
also provided, to enable frequency analyses (supports Celex and
Leipzig Corpora Collection file formats) and measures like
tf-idf. Note: For full functionality, a local installation of
TreeTagger is recommended. It is also recommended not to load
this package directly, but to load one of the available
language support packages from the 'l10n' repository
<https://undocumeantit.github.io/repos/l10n> instead. 'koRpus' also
includes a plugin for the R GUI and IDE RKWard, providing
graphical dialogs for its basic features. The respective R
package 'rkward' cannot be installed directly from a
repository, as it is a part of RKWard. To make full use of this
feature, please install RKWard from <https://rkward.kde.org>
(plugins are detected automatically). Due to some restrictions
on CRAN, the full package sources are only available from the
project homepage. To ask for help, report bugs, request
features, or discuss the development of the package, please
subscribe to the koRpus-dev mailing list
(<http://korpusml.reaktanz.de>).

Meik Michalke

koRpus

An R Package for Text Analysis

Earl Brown

Alberto Mirisola

Alexandre Brulet

Laura Hauser

kRp.tagged,-class function

<dl class="dl-horizontal">
<dt><code>lang</code></dt><dd>A character string naming the language that is assumed for the tokenized text in this object.</dd>
<dt><code>desc</code></dt><dd>Descriptive statistics of the tagged text.</dd>
<dt><code>TT.res</code></dt><dd>Results of the called tokenizer and POS tagger. The data.frame has eleven columns:
<dl class="dl-horizontal">
 <dt><code>doc_id</code>:</dt><dd>Optional document identifier.</dd>
 <dt><code>token</code>:</dt><dd>The tokenized text.</dd>
 <dt><code>tag</code>:</dt><dd>POS tags for each token.</dd>
 <dt><code>lemma</code>:</dt><dd>Lemma for each token.</dd>
 <dt><code>lttr</code>:</dt><dd>Number of letters.</dd>
 <dt><code>wclass</code>:</dt><dd>Word class.</dd>
 <dt><code>desc</code>:</dt><dd>A short description of the POS tag.</dd>
 <dt><code>stop</code>:</dt><dd>Logical, <code>TRUE</code> if token is a stopword.</dd>
 <dt><code>stem</code>:</dt><dd>Stemmed token.</dd>
 <dt><code>idx</code>:</dt><dd>Index number of the token in this document.</dd>
 <dt><code>sntc</code>:</dt><dd>Number of the sentence in this document.</dd>
</dl>
This data.frame structure adheres to the "Text Interchange Formats" guidelines set out by rOpenSci[1].</dd>
</dl>
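
The column layout described for <code>TT.res</code> can be sketched with a plain base R data.frame. This is only an illustration, not output of <code>treetag</code> or <code>tokenize</code>: the object name <code>tt_res</code>, all token values, tags and descriptions are invented, and the column types are assumptions rather than the types used by the package.

```r
# Hypothetical stand-in for the TT.res slot; every value below is invented
# for illustration and column types are assumed, not taken from koRpus.
words <- c("Texts", "are", "analyzed")

tt_res <- data.frame(
  doc_id = c("doc1", "doc1", "doc1"),        # optional document identifier
  token  = words,                            # the tokenized text
  tag    = c("NNS", "VBP", "VBN"),           # POS tag per token
  lemma  = c("text", "be", "analyze"),       # lemma per token
  lttr   = nchar(words),                     # number of letters
  wclass = c("noun", "verb", "verb"),        # word class
  desc   = c("Noun, plural",
             "Verb, non-3rd person singular present",
             "Verb, past participle"),       # short description of the tag
  stop   = c(FALSE, TRUE, FALSE),            # TRUE if token is a stopword
  stem   = c("text", "are", "analyz"),       # stemmed token
  idx    = 1:3,                              # index of token in the document
  sntc   = c(1L, 1L, 1L),                    # sentence number in the document
  stringsAsFactors = FALSE
)

# Check that the sketch has exactly the columns listed above, in order.
expected_cols <- c("doc_id", "token", "tag", "lemma", "lttr", "wclass",
                   "desc", "stop", "stem", "idx", "sntc")
has_layout <- identical(names(tt_res), expected_cols)

# Typical use of such a table: keep only tokens not flagged as stopwords.
content_tokens <- tt_res$token[!tt_res$stop]
```

With a real object of this class, the same kind of subsetting would be applied to the data.frame stored in the <code>TT.res</code> slot.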

Slots

Should you need to manually generate objects of this class (which should rarely be the case), the constructor function <code>kRp_tagged(...)</code> can be used instead of <code>new("kRp.tagged", ...)</code>.
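
The relation between a constructor function and <code>new()</code> can be illustrated with a toy S4 class. This is a minimal sketch under stated assumptions: <code>toyTagged</code> and <code>toy_tagged()</code> are hypothetical names, the slot types are guesses based on the slot descriptions, and none of this is the actual class definition shipped with the package.

```r
library(methods)

# "toyTagged" is a hypothetical stand-in, NOT the real kRp.tagged class.
# Slot names mirror the documented slots; slot types are assumptions.
setClass(
  "toyTagged",
  slots = c(lang = "character", desc = "list", TT.res = "data.frame")
)

# A constructor function that simply forwards its arguments to new().
toy_tagged <- function(...) new("toyTagged", ...)

# Both calls create an object of the same class with the same slot values.
via_new  <- new("toyTagged", lang = "en")
via_ctor <- toy_tagged(lang = "en")

same_class <- identical(class(via_new), class(via_ctor))
same_lang  <- identical(via_new@lang, via_ctor@lang)
```

A wrapper of this kind mainly saves typing the class name as a string; slots that are not supplied fall back to the class prototype in both cases.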

Constructor function


kRp.tagged,-class: S4 Class kRp.tagged

Description

Arguments

Slots

Constructor function

References