DEPRECATED. This function trains a GloVe word-embeddings model via fully asynchronous and parallel AdaGrad.
glove(tcm, vocabulary_size = nrow(tcm), word_vectors_size, x_max, num_iters,
shuffle_seed = NA_integer_, learning_rate = 0.05,
convergence_threshold = -1, grain_size = 100000L, alpha = 0.75, ...)
tcm: an object representing the term-co-occurrence matrix used in training. At the moment only dgTMatrix objects, or objects coercible to a dgTMatrix, are supported. Future releases will add support for out-of-core learning and streaming a TCM from disk.
vocabulary_size: number of words in the term-co-occurrence matrix.
word_vectors_size: desired dimension for the word vectors.
x_max: maximum number of co-occurrences to use in the weighting function. See the GloVe paper for details: http://nlp.stanford.edu/pubs/glove.pdf
num_iters: number of AdaGrad epochs.
shuffle_seed: integer seed used to shuffle the input before each SGD iteration; use NA_integer_ (the default) to turn shuffling off. Note that this parameter only controls shuffling: the result will still be non-deterministic because of the Hogwild-style asynchronous SGD. Shuffling is generally a good idea for stochastic gradient descent, but in my experience it does not improve convergence in this particular case. Please report if you find that shuffling improves your score.
learning_rate: learning rate for SGD. I do not recommend modifying this parameter, since AdaGrad quickly adjusts it to an optimal value.
convergence_threshold: defines the early-stopping strategy. Fitting stops when one of the two following conditions is satisfied: (a) all iterations have been used, or (b) cost_previous_iter / cost_current_iter - 1 < convergence_threshold. Note that the default of -1 effectively disables condition (b), since the ratio of two positive costs minus 1 cannot fall below -1 (see the convergence sketch after the argument list).
grain_size: I do not recommend adjusting this parameter. This is the grain_size for RcppParallel::parallelReduce; for details, see http://rcppcore.github.io/RcppParallel/#grain-size
alpha: the alpha in the weighting function formula: \(f(x) = (x / x_{max})^{\alpha}\) if \(x < x_{max}\), and \(f(x) = 1\) otherwise (see the weighting-function sketch after the argument list).
...: arguments passed to other methods (not used at the moment).
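
As an illustration of condition (b) above, here is a minimal sketch of the stopping rule in plain R (a hypothetical helper, not part of the package):

    # Returns TRUE when the relative improvement in cost between two epochs
    # drops below the threshold. With the default threshold of -1 this is
    # never TRUE for positive costs, so all num_iters epochs are used.
    has_converged <- function(cost_previous_iter, cost_current_iter,
                              convergence_threshold = -1) {
      cost_previous_iter / cost_current_iter - 1 < convergence_threshold
    }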
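
The weighting function controlled by x_max and alpha can be written in plain R as the following sketch (illustrative only; the defaults shown are the values used in the GloVe paper, and x_max has no default in glove() itself):

    # GloVe weighting: down-weights rare co-occurrences and caps the
    # contribution of very frequent co-occurrences at 1.
    glove_weight <- function(x, x_max = 100, alpha = 0.75) {
      ifelse(x < x_max, (x / x_max)^alpha, 1)
    }

    glove_weight(c(1, 50, 100, 500), x_max = 100)
    # counts below x_max are scaled smoothly; counts at or above it map to 1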
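
A minimal call illustrating the signature above (hypothetical toy data; a real TCM would be built from a corpus, and since this function is deprecated the replacement API should be preferred for new code):

    library(Matrix)
    library(text2vec)  # assumed here to be the package providing glove()

    # tiny 3 x 3 co-occurrence matrix coerced to the supported dgTMatrix class
    tcm <- as(sparseMatrix(i = c(1, 1, 2), j = c(2, 3, 3),
                           x = c(5, 2, 1), dims = c(3, 3)), "dgTMatrix")

    # train 2-dimensional word vectors for 5 AdaGrad epochs
    fit <- glove(tcm = tcm, word_vectors_size = 2, x_max = 10, num_iters = 5)
    # fit holds the trained word vectors (return structure not documented here)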