softmaxreg (version 1.2)

wordEmbed: Embed Words to Vectors Using Pre-trained Word2vec Dictionary

Description

Embeds the words in a string as vectors using a pre-trained word2vec dictionary. Users can also replace the word2vec data frame with their own customized data.

Usage

wordEmbed(object, dictionary, meanVec)

Arguments

object
Vector of text strings, each representing a document.
dictionary
Data frame holding a pre-trained word2vec dataset. The first column is the word and the following columns are the numeric vector components from a word2vec model. The default dataset shipped with the package is a pre-trained 20-dimensional word2vec dataset.
meanVec
Logical value. If meanVec is TRUE, a matrix is returned in which each row is the mean of the numeric vectors of all the words in a document. If FALSE, a list of matrices is returned in which each document is represented by one matrix.
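
To illustrate the dictionary layout described above, here is a minimal sketch of a custom dictionary data frame. The name toyDict, the words, and the 3-dimensional values are all made up for illustration; they are not part of the package.

```r
# Hypothetical 3-dimensional dictionary; the words and values are invented.
toyDict <- data.frame(
  word = c("this", "is", "an", "example"),
  V1 = c(0.1, 0.2, 0.3, 0.4),
  V2 = c(0.0, -0.1, 0.5, 0.2),
  V3 = c(0.7, 0.1, -0.2, 0.0),
  stringsAsFactors = FALSE
)

# Check the documented layout: first column holds the words,
# every following column is numeric.
stopifnot(is.character(toyDict[[1]]))
stopifnot(all(vapply(toyDict[-1], is.numeric, logical(1))))
```

A data frame shaped like this could then presumably be passed as the dictionary argument in place of word2vec. How words missing from the dictionary are treated is not stated on this page.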

Value

wordEmbed returns a matrix if meanVec is TRUE and a list of matrices if meanVec is FALSE.
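
The two return shapes can be pictured with a small base R sketch. This is not the softmaxreg implementation: embedSketch and toyDict are hypothetical, and the tokenization (lower-casing, splitting on whitespace) and the dropping of unknown words are assumptions made only for this illustration.

```r
# Sketch only: NOT the package's code. All names below are hypothetical.
toyDict <- data.frame(
  word = c("this", "is", "an", "example"),
  V1 = c(1, 3, 5, 7),
  V2 = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

embedSketch <- function(docs, dict, meanVec = TRUE) {
  # Numeric part of the dictionary, indexed by word
  vecs <- as.matrix(dict[, -1, drop = FALSE])
  rownames(vecs) <- dict[[1]]
  perDoc <- lapply(docs, function(doc) {
    # Assumption: lower-case the text and split on whitespace
    words <- strsplit(tolower(doc), "\\s+")[[1]]
    # Assumption: words absent from the dictionary are dropped
    words <- words[words %in% rownames(vecs)]
    vecs[words, , drop = FALSE]   # one row per word
  })
  if (meanVec) {
    # One row per document: column means over that document's words
    do.call(rbind, lapply(perDoc, colMeans))
  } else {
    perDoc                        # list with one matrix per document
  }
}

docs <- c("This is an example", "an example")
m <- embedSketch(docs, toyDict, meanVec = TRUE)
l <- embedSketch(docs, toyDict, meanVec = FALSE)
dim(m)      # documents x dimensions
length(l)   # one matrix per document
```

With meanVec = TRUE every document collapses to a single fixed-length row, which suits models that need one feature vector per document. With meanVec = FALSE the per-word rows are kept, so the matrices may differ in their number of rows.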

See Also

document, word2vec

Examples

library(softmaxreg)
data(word2vec) # load the default 20-dimensional word2vec dataset
doc <- "This is an example line of document"
# meanVec = TRUE returns a matrix with one row per document
docVectors <- wordEmbed(doc, word2vec, meanVec = TRUE)