This function applies the method described in Landauer & Dumais (1997): The local coherence is the cosine
between two adjacent sentences. The global coherence is then computed as the mean value of these local
coherences.
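The computation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the function's actual implementation: the helper `cosine_sim` and the toy matrix `sentence_vectors` are hypothetical, and the vectors are made up rather than taken from a real semantic space.

```r
# Cosine similarity between two numeric vectors (hypothetical helper)
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Made-up sentence vectors, one row per sentence (assumption: in practice
# these would be derived from the word vectors of a semantic space)
sentence_vectors <- rbind(
  s1 = c(1, 0, 1),
  s2 = c(1, 1, 0),
  s3 = c(0, 1, 1)
)

# Local coherence: cosine between each pair of adjacent sentences
n <- nrow(sentence_vectors)
local_coherence <- vapply(
  seq_len(n - 1),
  function(i) cosine_sim(sentence_vectors[i, ], sentence_vectors[i + 1, ]),
  numeric(1)
)

# Global coherence: mean of the local coherences
global_coherence <- mean(local_coherence)
```

A text with n sentences yields n - 1 local coherence values, so at least two sentences are needed for the global coherence to be defined.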
The format of x should be of the kind x <- "sentence1. sentence2. sentence3"
A sentence may also consist of just a single word.
To import a document Document.txt from a directory for coherence computation, set your working
directory to this directory using setwd(). Then use the following command lines (shown here for a file named Alice_in_Wonderland.txt):
fileName1 <- "Alice_in_Wonderland.txt"
x <- readChar(fileName1, file.info(fileName1)$size)
In the traditional LSA approach, the vector D for a document (or a sentence) consisting of the words (t1, ..., tn) is computed as
$$D = \sum\limits_{i=1}^n t_i$$
This is the default method (method="Add") for this function. Alternatively, this function provides the possibility of computing the document vector from its word vectors using element-wise multiplication (see Mitchell & Lapata, 2010 and compose).
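To make the difference between the two composition methods concrete, here is a minimal base R sketch. The matrix `word_vectors` and its values are hypothetical, and the code only illustrates the arithmetic; it does not call the package itself.

```r
# Made-up word vectors, one row per word (assumption: real vectors would be
# rows of a semantic space)
word_vectors <- rbind(
  t1 = c(1, 2, 0),
  t2 = c(3, 1, 2),
  t3 = c(2, 2, 1)
)

# Additive composition: the document vector is the sum of its word vectors
doc_add <- colSums(word_vectors)

# Multiplicative composition: element-wise product of the word vectors
doc_mult <- apply(word_vectors, 2, prod)
```

Note that with element-wise multiplication, a zero in any one word vector forces the corresponding dimension of the document vector to zero, whereas addition does not have this effect.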
A note will be displayed whenever not all words of one input string are found in the semantic space. Caution: In that case, the function will still produce a result, by omitting the words not found in the semantic space. Depending on the specific requirements of a task, this may compromise the results. Please check your input when you receive this message.
A warning message will be displayed whenever no word of one input string is found in the semantic space.