freq.analysis

freq.analysis,kRp.taggedText-method

freq.analysis,character-method

Either an object of class <code><a rd-options="koRpus:kRp.tagged-class" href="/link/kRp.tagged?package=koRpus&version=0.11-5&to=koRpus%3AkRp.tagged-class" data-mini-rdoc="koRpus:kRp.tagged-class::kRp.tagged">kRp.tagged</a></code>,
 <code><a rd-options="koRpus:kRp.txt.freq-class" href="/link/kRp.txt.freq?package=koRpus&version=0.11-5&to=koRpus%3AkRp.txt.freq-class" data-mini-rdoc="koRpus:kRp.txt.freq-class::kRp.txt.freq">kRp.txt.freq</a></code>,
<code><a rd-options="koRpus:kRp.analysis-class" href="/link/kRp.analysis?package=koRpus&version=0.11-5&to=koRpus%3AkRp.analysis-class" data-mini-rdoc="koRpus:kRp.analysis-class::kRp.analysis">kRp.analysis</a></code> or <code><a rd-options="koRpus:kRp.txt.trans-class" href="/link/kRp.txt.trans?package=koRpus&version=0.11-5&to=koRpus%3AkRp.txt.trans-class" data-mini-rdoc="koRpus:kRp.txt.trans-class::kRp.txt.trans">kRp.txt.trans</a></code>,
 or a character vector which must
be a valid path to a file containing the text to be analyzed.

txt.file

Additional options to be passed through to the function defined with <code>tagger</code>.

...

An object of class <code><a rd-options="koRpus:kRp.corp.freq-class" href="/link/kRp.corp.freq?package=koRpus&version=0.11-5&to=koRpus%3AkRp.corp.freq-class" data-mini-rdoc="koRpus:kRp.corp.freq-class::kRp.corp.freq">kRp.corp.freq</a></code>.

corp.freq

Logical, whether a descriptive statistical analysis should be performed.

desc.stat

A character string defining the language to be assumed for the text, overriding any language already set for it.

force.lang

A character string defining the tokenizer/tagger command you want to use for basic text analysis. Can be omitted if
<code>txt.file</code> is already of class <code>kRp.tagged-class</code>. Defaults to <code>"kRp.env"</code>, which takes the settings from
<code><a rd-options="koRpus:get.kRp.env" href="/link/get.kRp.env?package=koRpus&version=0.11-5&to=koRpus%3Aget.kRp.env" data-mini-rdoc="koRpus:get.kRp.env::get.kRp.env">get.kRp.env</a></code>. Set to <code>"tokenize"</code> to use <code><a rd-options="koRpus:tokenize" href="/link/tokenize?package=koRpus&version=0.11-5&to=koRpus%3Atokenize" data-mini-rdoc="koRpus:tokenize::tokenize">tokenize</a></code>.

tagger

A character vector with word classes which should be ignored for frequency analysis. The default value
<code>"nonpunct"</code> has special meaning and will cause the result of
<code>kRp.POS.tags(lang, c("punct","sentc"), list.classes=TRUE)</code> to be used.

corp.rm.class

A character vector with POS tags which should be ignored for frequency analysis.

corp.rm.tag
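
The effect of <code>corp.rm.class</code> and <code>corp.rm.tag</code> can be pictured with a small base R sketch. This is not the koRpus implementation: the data frame, its column names, and the class and tag values are hypothetical stand-ins for a tagged text, chosen only to show tokens being dropped by word class and by POS tag before counting.

```r
# Hypothetical tagged text: one row per token (illustration only,
# not an actual koRpus object)
tagged <- data.frame(
  token  = c("The", "cat", "sat", ".", "Really", "!"),
  tag    = c("DT", "NN", "VBD", "SENT", "RB", "SENT"),
  wclass = c("determiner", "noun", "verb", "fullstop", "adverb", "fullstop"),
  stringsAsFactors = FALSE
)

# Assumed values: word classes and POS tags to ignore
rm.class <- c("punctuation", "fullstop")
rm.tag   <- c("DT")

# Keep only tokens whose class and tag are both not excluded
keep    <- !(tagged$wclass %in% rm.class) & !(tagged$tag %in% rm.tag)
counted <- tagged[keep, ]

# Frequency table of the remaining tokens, case-folded
freq <- table(tolower(counted$token))
print(freq)
```

Both filters are applied together here, so a token is counted only if neither its word class nor its tag is listed.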

Logical, whether the term frequency-inverse document frequency statistic (tf-idf) should be computed. Requires
<code>corp.freq</code> to provide appropriate idf values for the types in <code>txt.file</code>. Missing idf values will result in <code>NA</code>.

tfidf
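
To illustrate why missing idf values lead to <code>NA</code>, here is a minimal base R sketch of a tf-idf calculation. It assumes relative term frequency as the tf component and uses made-up tokens and idf values; the exact formula used by koRpus may differ, so treat this as a picture of the general idea rather than the package's method.

```r
# Hypothetical tokens of the text to analyze
tokens <- c("the", "cat", "sat", "on", "the", "mat")

# Hypothetical idf values as a reference corpus might provide them;
# "mat" is deliberately missing
idf <- c(the = 0.1, cat = 2.3, sat = 2.9, on = 0.4)

# Relative term frequency of each type in the text (an assumption)
tf <- table(tokens) / length(tokens)

# Look up idf by type name; types unknown to the corpus yield NA
tfidf <- as.numeric(tf) * unname(idf[names(tf)])
names(tfidf) <- names(tf)
print(tfidf)
```

The type "mat" has no idf entry, so its tf-idf value is <code>NA</code>, while all other types get a numeric score.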

The function <code>freq.analysis</code> analyzes the frequencies of tokens, word classes, etc. in a text.

misc

A set of tools to analyze texts. Includes, amongst others,
functions for automatic language detection, hyphenation,
several indices of lexical diversity (e.g., type token ratio,
HD-D/vocd-D, MTLD) and readability (e.g., Flesch, SMOG, LIX,
Dale-Chall). Basic import functions for language corpora are
also provided, to enable frequency analyses (supports Celex and
Leipzig Corpora Collection file formats) and measures like
tf-idf. Note: For full functionality a local installation of
TreeTagger is recommended. It is also recommended not to load
this package directly, but to load one of the available
language support packages from the 'l10n' repository
<https://undocumeantit.github.io/repos/l10n>. 'koRpus' also
includes a plugin for the R GUI and IDE RKWard, providing
graphical dialogs for its basic features. The respective R
package 'rkward' cannot be installed directly from a
repository, as it is a part of RKWard. To make full use of this
feature, please install RKWard from <https://rkward.kde.org>
(plugins are detected automatically). Due to some restrictions
on CRAN, the full package sources are only available from the
project homepage. To ask for help, report bugs, request
features, or discuss the development of the package, please
subscribe to the koRpus-dev mailing list
(<http://korpusml.reaktanz.de>).

Meik Michalke

koRpus

An R Package for Text Analysis

Earl Brown

Alberto Mirisola

Alexandre Brulet

Laura Hauser

freq.analysis function

freq.analysis: Analyze word frequencies
