word2vec (version 0.4.0)

read.word2vec: Read a binary word2vec model from disk

Description

Read a binary word2vec model from disk

Usage

read.word2vec(file, normalize = FALSE)

Value

An object of class w2v, which is a list with the following elements:

  • model: an Rcpp pointer to the model

  • model_path: the path to the model on disk

  • dim: the dimension of the embedding matrix

  • n: the number of words in the vocabulary

Arguments

file

the path to the model file

normalize

Logical indicating whether to normalize the embeddings by dividing them by the factor sqrt(sum(x * x) / length(x)). Defaults to FALSE.
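The normalization factor can be illustrated on a plain numeric vector in base R. This is only a sketch of the formula as written above: the vector `x` and the names `scale_factor`, `x_norm`, and `rms_after` are hypothetical, and the code does not call the package or claim to reproduce its internals.

```r
# Hypothetical embedding vector, not taken from any real model
x <- c(0.5, -1.2, 2.0, 0.3)

# Factor from the argument description: sqrt(sum(x * x) / length(x))
scale_factor <- sqrt(sum(x * x) / length(x))

# Divide the vector by that factor
x_norm <- x / scale_factor

# After dividing, the root mean square of the vector equals 1
rms_after <- sqrt(mean(x_norm^2))
print(rms_after)
```

Dividing by this factor rescales a vector so that its root mean square is 1, which means its Euclidean length becomes sqrt(length(x)) rather than 1.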

Examples

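# Locate and read the example model shipped with the package
path  <- system.file(package = "word2vec", "models", "example.bin")
model <- read.word2vec(path)
# Get the vocabulary of the model
vocab <- summary(model, type = "vocabulary")
# Look up the embeddings of a few terms
emb <- predict(model, c("bus", "naar", "unknownword"), type = "embedding")
emb
# Find the nearest terms to each input term
nn  <- predict(model, c("bus", "toilet"), type = "nearest")
nn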

# Do some calculations with the vectors and find similar terms to these
emb <- as.matrix(model)
vector <- emb["gastvrouw", ] - emb["gastvrij", ]
predict(model, vector, type = "nearest", top_n = 5)
vectors <- emb[c("gastheer", "gastvrouw"), ]
vectors <- rbind(vectors, avg = colMeans(vectors))
predict(model, vectors, type = "nearest", top_n = 10)
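The matrix steps in the last part of the example (subtracting two rows, then appending an averaged row) are ordinary base R operations and can be tried without a model. The sketch below uses a small hypothetical matrix with made-up values in place of as.matrix(model); only the row names are borrowed from the example, and `diff_vec` is an illustrative name.

```r
# Hypothetical 3-dimensional "embeddings"; the values are invented for illustration
emb <- matrix(c(1, 2, 3,
                3, 2, 1,
                0, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("gastheer", "gastvrouw", "gastvrij"), NULL))

# Difference between two word vectors (one row minus another)
diff_vec <- emb["gastvrouw", ] - emb["gastvrij", ]

# Select two rows and append their column-wise average as a row named "avg"
vectors <- emb[c("gastheer", "gastvrouw"), ]
vectors <- rbind(vectors, avg = colMeans(vectors))
print(vectors)
```

The resulting objects have the same shape as those passed to predict() in the example: a single numeric vector and a matrix with one row per query.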