wordEmbed:
Embed Words to Vectors Using Pre-trained Word2vec Dictionary
Description
Embeds the words in each text string as numeric vectors using a pre-trained word2vec dictionary.
Users can also replace the default word2vec data frame with their own data.
Usage
wordEmbed(object, dictionary, meanVec)
Arguments
object
A vector of text strings, each element representing one document.
dictionary
Data frame holding a pre-trained word2vec dataset. The first column contains the words and the remaining columns contain the numeric vector components from the word2vec model. The default dataset shipped with the package is a pre-trained 20-dimensional word2vec dataset.
meanVec
Logical value. If meanVec is TRUE, a matrix is returned in which each row is the mean of the numeric vectors of all the words in one document. If FALSE, a list of matrices is returned in which each document is represented by its own matrix.
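As a rough illustration of the layout the dictionary argument expects, the sketch below builds a small custom data frame: one word column followed by numeric columns. The name customDict, the column names, and all values are hypothetical and chosen only for illustration; the call to wordEmbed is left commented out because it assumes the package is attached.

```r
# Hypothetical 3-dimensional dictionary; words and values are made up.
# Layout follows the description above: first column = word,
# remaining columns = numeric vector components.
customDict <- data.frame(
  word = c("this", "is", "an", "example"),
  V1 = c(0.10, -0.20, 0.05, 0.30),
  V2 = c(0.40, 0.10, -0.30, 0.20),
  V3 = c(-0.10, 0.25, 0.15, -0.05),
  stringsAsFactors = FALSE
)

# Inspect the structure before passing it on.
str(customDict)

# With the package attached, the custom dictionary would replace the default:
# docVectors <- wordEmbed("This is an example", customDict, meanVec = TRUE)
```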
Value
wordEmbed returns a matrix with one row per document if meanVec is TRUE, and a list of matrices with one matrix per document if meanVec is FALSE.
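The two return shapes can be mimicked in base R. The sketch below is not the package's implementation: the helper embedOne, the toy dictionary, and the preprocessing (lowercasing, splitting on whitespace, dropping words missing from the dictionary) are all assumptions made to show how a per-document matrix relates to a per-document mean vector.

```r
# Toy dictionary with hypothetical values: first column is the word,
# remaining columns are numeric components.
toyDict <- data.frame(
  word = c("good", "bad", "movie"),
  V1 = c(1, -1, 0),
  V2 = c(0, 0, 2),
  stringsAsFactors = FALSE
)

# Assumed preprocessing: lowercase, split on whitespace,
# silently drop words that are not in the dictionary.
embedOne <- function(doc, dict) {
  tokens <- strsplit(tolower(doc), "\\s+")[[1]]
  idx <- match(tokens, dict[[1]])
  idx <- idx[!is.na(idx)]
  mat <- as.matrix(dict[idx, -1, drop = FALSE])
  rownames(mat) <- dict[[1]][idx]
  mat
}

docs <- c("good movie", "bad movie")

# Analogue of meanVec = FALSE: a list with one matrix per document,
# one row per matched word.
perDoc <- lapply(docs, embedOne, dict = toyDict)

# Analogue of meanVec = TRUE: a matrix with one row per document,
# each row the column-wise mean of that document's word vectors.
meanMat <- t(sapply(perDoc, colMeans))
```

In this sketch perDoc[[1]] has one row for "good" and one for "movie", and the first row of meanMat is their average.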
Examples
data(word2vec) # load the default 20-dimensional word2vec dataset
doc = "This is an example line of document"
docVectors = wordEmbed(doc, word2vec, meanVec = TRUE)