get_tokens

Parses a string into a vector of word tokens.

Extracts sentiment and sentiment-derived plot arcs
from text using a variety of sentiment dictionaries conveniently
packaged for consumption by R users. Implemented dictionaries include
"syuzhet" (default), developed in the Nebraska Literary Lab;
"afinn", developed by Finn Årup Nielsen; "bing", developed by Minqing Hu
and Bing Liu; and "nrc", developed by Saif M. Mohammad and Peter D. Turney.
Applicable references are available in README.md and in the documentation
for the "get_sentiment" function. The package also provides a hack for
implementing Stanford's coreNLP sentiment parser, as well as
several methods for plot arc normalization.

Matthew Jockers

syuzhet

Extracts Sentiment and Sentiment-Derived Plot Arcs from Text

get_tokens function

Arguments

<dl><dt>text_of_file</dt>
<dd>A text string to tokenize</dd>
<dt>pattern</dt>
<dd>A regular expression that defines where tokens are split</dd>
<dt>lowercase</dt>
<dd>Whether tokens should be converted to lowercase. Defaults to TRUE</dd></dl>
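
To make the three arguments concrete, here is a minimal base-R sketch of regex-based word tokenization. It is not the package's source: the helper name `sketch_tokens` is hypothetical, and the default pattern `"\\W"` (split on any non-word character) is an assumption, since this page does not show the Usage signature.

```r
# Hypothetical helper, not part of syuzhet: illustrates the idea of
# splitting a string on a regular expression to obtain word tokens.
sketch_tokens <- function(text_of_file, pattern = "\\W", lowercase = TRUE) {
  # Optionally normalize case before splitting
  if (lowercase) {
    text_of_file <- tolower(text_of_file)
  }
  # strsplit() returns a list with one element per input string; flatten it
  tokens <- unlist(strsplit(text_of_file, pattern))
  # Adjacent separators (e.g. ". ") yield empty strings; drop them
  tokens[tokens != ""]
}

text <- "Call me Ishmael. Some years ago, never mind how long precisely."

tokens <- sketch_tokens(text)                          # lowercased tokens
tokens_cased <- sketch_tokens(text, lowercase = FALSE) # original case kept
print(tokens)
```

With the package loaded, the corresponding call would presumably be `get_tokens(text)`, passing `pattern` or `lowercase` only when the defaults do not suit the text.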

Word Tokenization — get_tokens
