hyphen

hyphen,kRp.taggedText-method

hyphen,character-method

Either an object of class <code><a rd-options="koRpus" href="/link/kRp.tagged-class?package=koRpus&version=0.10-2&to=koRpus" data-mini-rdoc="koRpus::kRp.tagged-class">kRp.tagged-class</a></code>,
 <code><a rd-options="koRpus" href="/link/kRp.txt.freq-class?package=koRpus&version=0.10-2&to=koRpus" data-mini-rdoc="koRpus::kRp.txt.freq-class">kRp.txt.freq-class</a></code> or
<code><a rd-options="koRpus" href="/link/kRp.analysis-class?package=koRpus&version=0.10-2&to=koRpus" data-mini-rdoc="koRpus::kRp.analysis-class">kRp.analysis-class</a></code>,
 or a character vector with words to be hyphenated.

words

Either an object of class <code><a rd-options="koRpus" href="/link/kRp.hyph.pat-class?package=koRpus&version=0.10-2&to=koRpus" data-mini-rdoc="koRpus::kRp.hyph.pat-class">kRp.hyph.pat-class</a></code>, or
a valid character string naming the language of the patterns to be used. See details.

hyph.pattern

Integer,
 the minimum number of letters a word must have to be considered for hyphenation. <code>hyphen</code> will
not split words after the first or before the last letter,
 so values smaller than 4 are not useful.

min.length

Logical,
 whether hyphens already present in words should be removed before pattern matching.

rm.hyph

A character vector with word classes that should be ignored. The default value
<code>"nonpunct"</code> has a special meaning and will cause the result of
<code>kRp.POS.tags(lang, c("punct","sentc"),
 list.classes=TRUE)</code> to be used. Relevant only if <code>words</code>
is a valid koRpus object.

corp.rm.class

A character vector with POS tags which should be ignored. Relevant only if <code>words</code>
is a valid koRpus object.

corp.rm.tag

Logical. If <code>FALSE</code>, short status messages will be shown.

quiet

Logical. <code>hyphen()</code> can cache results to speed up the process. If this option is set to <code>TRUE</code>,
 the
current cache will be queried and new tokens will also be added to it. Caches are language-specific and reside in an environment,
i.e., they are cleared at the end of a session. If you want to save them for later use,
 see the option <code>hyph.cache.file</code>
in <code><a rd-options="koRpus:set.kRp.env" href="/link/set.kRp.env?package=koRpus&version=0.10-2&to=koRpus%3Aset.kRp.env" data-mini-rdoc="koRpus:set.kRp.env::set.kRp.env">set.kRp.env</a></code>.

cache
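To make the caching idea more concrete, here is a minimal base R sketch of a language-specific cache kept in an environment. It is not the koRpus implementation: <code>hyph_cache</code>, <code>cached_hyphenate</code> and <code>slow_dummy</code> are hypothetical names chosen for illustration, and the dummy function only stands in for a real hyphenation routine.

```r
# Hypothetical names throughout; this is not koRpus's internal cache.
hyph_cache <- new.env(parent = emptyenv())

cached_hyphenate <- function(word, lang, hyphenate_fun) {
  # one sub-environment per language keeps the caches language-specific
  if (!exists(lang, envir = hyph_cache, inherits = FALSE)) {
    assign(lang, new.env(parent = emptyenv()), envir = hyph_cache)
  }
  lang_cache <- get(lang, envir = hyph_cache, inherits = FALSE)
  if (exists(word, envir = lang_cache, inherits = FALSE)) {
    # cache hit: return the stored result without recomputing
    return(get(word, envir = lang_cache, inherits = FALSE))
  }
  # cache miss: compute the result and store it as a new token
  result <- hyphenate_fun(word)
  assign(word, result, envir = lang_cache)
  result
}

# Placeholder "hyphenation" that counts how often it is really called.
calls <- 0L
slow_dummy <- function(word) {
  calls <<- calls + 1L
  paste(strsplit(word, "", fixed = TRUE)[[1]], collapse = "-")
}

cached_hyphenate("word", "en", slow_dummy)
cached_hyphenate("word", "en", slow_dummy)  # answered from the cache
print(calls)  # 1
```

Because the cache lives in an environment, it disappears when the R session ends, which is why a separate option is needed to persist it to a file.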

These methods implement word hyphenation, based on Liang's algorithm.
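As a rough illustration of the pattern-matching idea behind Liang's algorithm, the following base R sketch may help. It is not the koRpus implementation: the function names <code>parse_pattern</code> and <code>hyphenate_sketch</code>, the tiny pattern set, and the <code>left_min</code>/<code>right_min</code> parameters are assumptions made for this example. Patterns carry digits between letters; every pattern that matches the word (padded with dots) contributes its digits, the highest digit per gap wins, and odd values mark permitted break points.

```r
# A toy pattern set, just enough for one word; real pattern files are far larger.
toy_patterns <- c("hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
                  "1tio", "2io", "o2n")

# Split a pattern into its letters and the digits around them,
# e.g. "hy3ph" -> letters "hyph", scores c(0, 0, 3, 0, 0).
parse_pattern <- function(pattern) {
  letters_only <- character(0)
  scores <- 0L  # slot for a digit before the first letter
  for (ch in strsplit(pattern, "", fixed = TRUE)[[1]]) {
    if (grepl("[0-9]", ch)) {
      scores[length(scores)] <- as.integer(ch)
    } else {
      letters_only <- c(letters_only, ch)
      scores <- c(scores, 0L)
    }
  }
  list(letters = paste(letters_only, collapse = ""), scores = scores)
}

# left_min/right_min are assumed parameters: never split after the first
# or before the last letter when both are 2.
hyphenate_sketch <- function(word, patterns = toy_patterns,
                             left_min = 2L, right_min = 2L) {
  n <- nchar(word)
  if (n < left_min + right_min) return(word)
  dotted <- paste0(".", tolower(word), ".")
  # points[k] is the score of the gap before character k of the dotted word
  points <- integer(nchar(dotted) + 1L)
  for (p in lapply(patterns, parse_pattern)) {
    len <- nchar(p$letters)
    last_start <- nchar(dotted) - len + 1L
    if (last_start < 1L) next
    for (start in seq_len(last_start)) {
      if (substr(dotted, start, start + len - 1L) == p$letters) {
        idx <- start:(start + len)
        points[idx] <- pmax(points[idx], p$scores)  # highest digit wins
      }
    }
  }
  # the gap after m letters of the original word sits at index m + 2
  m <- left_min:(n - right_min)
  breaks <- m[points[m + 2L] %% 2L == 1L]  # odd score: break allowed
  paste(substring(word, c(1L, breaks + 1L), c(breaks, n)), collapse = "-")
}

print(hyphenate_sketch("hyphenation"))  # "hy-phen-ation"
```

With the default limits, a word needs at least four letters before any break is possible, which matches the remark that smaller <code>min.length</code> values are not useful.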

hyphenation

A set of tools to analyze texts. Includes, amongst others, functions for automatic language detection, hyphenation,
several indices of lexical diversity (e.g., type token ratio, HD-D/vocd-D, MTLD) and readability (e.g., Flesch, SMOG,
LIX, Dale-Chall). Basic import functions for language corpora are also provided, to enable frequency analyses (supports
Celex and Leipzig Corpora Collection file formats) and measures like tf-idf. Support for additional languages can be
added on-the-fly or by plugin packages. Note: For full functionality a local installation of TreeTagger is recommended.
'koRpus' also includes a plugin for the R GUI and IDE RKWard, providing graphical dialogs for its basic features. The
respective R package 'rkward' cannot be installed directly from a repository, as it is a part of RKWard. To make full
use of this feature, please install RKWard from <https://rkward.kde.org> (plugins are detected automatically). Due to
some restrictions on CRAN, the full package sources are only available from the project homepage. To ask for help,
report bugs, request features, or discuss the development of the package, please subscribe to the koRpus-dev mailing
list (<http://korpusml.reaktanz.de>).

Meik Michalke

koRpus

An R Package for Text Analysis

m.eik michalke

Earl Brown

Alberto Mirisola

Alexandre Brulet

Laura Hauser

hyphen: Automatic hyphenation

Description

Usage

Arguments

Value

Details

References

See Also

Examples