predict.word2vec: Predict functionalities for a word2vec model

Description

Get either

the embedding of words, or
the nearest words which are similar to either a word or a word vector

Usage

# S3 method for word2vec
predict(
  object,
  newdata,
  type = c("nearest", "embedding"),
  top_n = 10L,
  encoding = "UTF-8",
  ...
)

Value

Depending on the type, you get a different result back:

for type 'nearest': a list of data.frames with columns term, similarity and rank, indicating which words are closest to the provided newdata words or word vectors. If newdata is a single vector instead of a matrix, a data.frame is returned instead of a list
for type 'embedding': a matrix of word vectors of the words provided in newdata
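
To make the shape of the 'nearest' result more concrete, here is a minimal base R sketch that ranks the rows of a toy embedding matrix against one query vector and returns a data.frame with columns term, similarity and rank. It assumes cosine similarity as the measure, which this page does not state; `nearest_sketch` and the toy values are hypothetical and not part of the word2vec package.

```r
# Toy embedding matrix: rows are words, columns are dimensions (made-up values).
toy_emb <- rbind(
  bus    = c(1.0, 0.0, 0.2),
  tram   = c(0.9, 0.1, 0.3),
  toilet = c(0.0, 1.0, 0.1)
)

# Hypothetical helper for illustration only, not the package implementation.
# Assumption: similarity is the cosine between the query and each row.
nearest_sketch <- function(emb, vector, top_n = 10L) {
  # Dot product of every row with the query, divided by the product of norms
  sims <- as.vector(emb %*% vector) /
    (sqrt(rowSums(emb^2)) * sqrt(sum(vector^2)))
  # Indices of the top_n most similar rows, best first
  ord <- head(order(sims, decreasing = TRUE), top_n)
  data.frame(
    term       = rownames(emb)[ord],
    similarity = sims[ord],
    rank       = seq_along(ord),
    stringsAsFactors = FALSE
  )
}

res <- nearest_sketch(toy_emb, toy_emb["bus", ], top_n = 2)
res
```

In this sketch the query word itself is not excluded from the candidates, so it ranks first; whether predict does the same for word vectors is not covered here.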

Arguments

object: a word2vec model as returned by word2vec or read.word2vec
newdata: for type 'embedding', newdata should be a character vector of words; for type 'nearest', newdata should be a character vector of words, or a vector or matrix in the embedding space
type: either 'embedding' or 'nearest'. Defaults to 'nearest'.
top_n: show only the top n nearest neighbours. Defaults to 10.
encoding: set the encoding of the text elements to the specified encoding. Defaults to 'UTF-8'.
...: not used

Examples

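library(word2vec)
# Load the example model that ships with the package
path  <- system.file(package = "word2vec", "models", "example.bin")
model <- read.word2vec(path)
# Embeddings for a character vector of words
emb <- predict(model, c("bus", "toilet", "unknownword"), type = "embedding")
emb
# The 5 nearest terms for each of the given words
nn  <- predict(model, c("bus", "toilet"), type = "nearest", top_n = 5)
nn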

# Do some calculations with the vectors and find similar terms to these
emb <- as.matrix(model)
vector <- emb["buurt", ] - emb["rustige", ] + emb["restaurants", ]
predict(model, vector, type = "nearest", top_n = 10)

vector <- emb["gastvrouw", ] - emb["gastvrij", ]
predict(model, vector, type = "nearest", top_n = 5)

vectors <- emb[c("gastheer", "gastvrouw"), ]
vectors <- rbind(vectors, avg = colMeans(vectors))
predict(model, vectors, type = "nearest", top_n = 10)
