toDocumentTermMatrix


Performs a sentiment analysis of textual contents in R. This implementation
utilizes various existing dictionaries, such as Harvard IV, or finance-specific
dictionaries. Furthermore, it can also create customized dictionaries. The latter
uses LASSO regularization as a statistical approach to select relevant terms based on
an exogenous response variable.

Nicolas Proellochs

SentimentAnalysis

Dictionary-Based Sentiment Analysis

Stefan Feuerriegel

toDocumentTermMatrix function

<dl><dt>x</dt>
<dd><code><a href="/link/Corpus?package=SentimentAnalysis&version=1.3-4" data-mini-rdoc="SentimentAnalysis::Corpus">Corpus</a></code> object which should be processed</dd>
<dt>language</dt>
<dd>Default language used for preprocessing (i.e. stop word removal and stemming)</dd>
<dt>minWordLength</dt>
<dd>Minimum length of words used for cut-off; i.e. shorter words are 
removed. Default is 3.</dd>
<dt>sparsity</dt>
<dd>A numeric for the maximal allowed sparsity, strictly greater than zero and 
strictly less than one. Default is <code>NULL</code>, which disables this filtering.</dd>
<dt>removeStopwords</dt>
<dd>Flag indicating whether to remove stopwords or not (default: TRUE)</dd>
<dt>stemming</dt>
<dd>Perform stemming (default: TRUE)</dd>
<dt>weighting</dt>
<dd>Function used for weighting of words; default is a link to the tf-idf scheme.</dd></dl>
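
The arguments above can be exercised with a short call. The following is a minimal sketch, assuming the tm package is installed alongside SentimentAnalysis; the sample sentences and the variable names (`documents`, `corpus`, `dtm_default`, `dtm_custom`) are made up for illustration, and the term-frequency weighting passed in the second call is just one possible alternative to the default.

```r
library(SentimentAnalysis)
library(tm)

# Hypothetical sample documents, invented for this illustration
documents <- c("This is a very good thing.",
               "This is a very bad thing.",
               "Nothing special happened today.")

# Build a tm corpus from the character vector
corpus <- VCorpus(VectorSource(documents))

# Call with all defaults
dtm_default <- toDocumentTermMatrix(corpus)

# Call with explicit arguments: keep words of at least 4 characters,
# remove stopwords, skip stemming, and use plain term frequencies
dtm_custom <- toDocumentTermMatrix(corpus,
                                   language = "english",
                                   minWordLength = 4,
                                   removeStopwords = TRUE,
                                   stemming = FALSE,
                                   weighting = function(x) tm::weightTf(x))

inspect(dtm_custom)
```

Leaving `sparsity` at `NULL`, as in both calls here, skips the removal of sparse terms; with only three short documents, filtering by sparsity would discard most of the vocabulary.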

Arguments

Preprocesses an existing corpus of type <code><a href='https://rdrr.io/pkg/tm/man/Corpus.html'>Corpus</a></code> using a set of default operations. 
This helper function bundles all standard preprocessing steps to make the package 
more convenient to use. The result is a document-term matrix.
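
To give a sense of what such a bundle of preprocessing steps involves, here is a rough sketch of a comparable pipeline written directly against tm. It is not the package's actual implementation: the exact steps, their order, and the settings are assumptions chosen for illustration, as are the sample sentences. Stemming via `stemDocument` additionally requires the SnowballC package.

```r
library(tm)

# Hypothetical sample documents, invented for this illustration
documents <- c("This is a very good thing.",
               "This is a very bad thing.",
               "Nothing special happened today.")
corpus <- VCorpus(VectorSource(documents))

# Assumed cleaning steps: lower-casing, punctuation and number removal
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)

# Stop word removal and stemming for the chosen language
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, language = "english")
corpus <- tm_map(corpus, stripWhitespace)

# Document-term matrix with a minimum word length of 3 and tf-idf weighting
dtm <- DocumentTermMatrix(corpus,
                          control = list(wordLengths = c(3, Inf),
                                         weighting = weightTfIdf))

# Optional sparsity filter; the value must lie strictly between 0 and 1
dtm_filtered <- removeSparseTerms(dtm, sparse = 0.99)
```

Calling the helper replaces all of these steps with a single function call, which is the convenience the description refers to.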

Default preprocessing of corpus and conversion to document-term matrix — toDocumentTermMatrix


