lda (version 1.1)

nubbi.collapsed.gibbs.sampler: Collapsed Gibbs Sampling for the Networks Uncovered By Bayesian Inference (NUBBI) Model.

Description

Fit a NUBBI model, which takes as input a collection of entities with corresponding textual descriptions, as well as a set of descriptions for pairs of entities. The NUBBI model then produces a latent space description of both the entities and the relationships between them.

Usage

nubbi.collapsed.gibbs.sampler(contexts, pair.contexts, pairs, K.individual, K.pair, vocab, num.iterations, alpha, eta, xi)

Arguments

contexts
The set of textual descriptions (i.e., documents) for individual entities in LDA format (see lda.collapsed.gibbs.sampler for details).
pair.contexts
A set of textual descriptions for pairs of entities, also in LDA format.
pairs
Labels indicating which pair each element of pair.contexts refers to. This parameter should be an integer matrix with two columns and the same number of rows as pair.contexts. The two elements in each row of pair
K.individual
A scalar integer representing the number of topics for the individual entities.
K.pair
A scalar integer representing the number of topics for entity pairs.
vocab
A character vector specifying the vocabulary words associated with the word indices used in contexts and pair.contexts.
num.iterations
The number of sweeps of Gibbs sampling over the entire corpus to make.
alpha
The scalar value of the Dirichlet hyperparameter for topic proportions.
eta
The scalar value of the Dirichlet hyperparameter for topic multinomials.
xi
The scalar value of the Dirichlet hyperparameter for source proportions.
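
As a rough illustration of how the arguments above fit together, the following base R sketch builds toy inputs and checks the shape constraints stated for pairs. It assumes the LDA format described in lda.collapsed.gibbs.sampler (each document is a two-row integer matrix holding 0-indexed word ids in the first row and counts in the second). The text above does not say whether the entity indices in pairs are 0- or 1-based; 0-based is assumed here and should be checked against the package documentation. The helper make_doc and all data are hypothetical.

```r
# Toy vocabulary (hypothetical data, for illustration only).
vocab <- c("alice", "bob", "friend", "rival", "meets")

# Hypothetical helper: one document as a 2-row integer matrix.
# Row 1: 0-indexed word ids into vocab; row 2: counts (assumed LDA format).
make_doc <- function(word_ids, counts) {
  rbind(as.integer(word_ids), as.integer(counts))
}

# Descriptions of three individual entities.
contexts <- list(
  make_doc(c(0, 2), c(3, 1)),
  make_doc(c(1, 3), c(2, 2)),
  make_doc(c(4), c(1))
)

# Descriptions of two entity pairs.
pair.contexts <- list(
  make_doc(c(2, 4), c(1, 2)),
  make_doc(c(3), c(4))
)

# One row per element of pair.contexts, naming the two entities involved.
# ASSUMPTION: 0-based indices into contexts.
pairs <- matrix(c(0L, 1L,
                  1L, 2L), ncol = 2, byrow = TRUE)

# Shape checks taken from the argument descriptions.
max_word_id <- max(unlist(lapply(c(contexts, pair.contexts),
                                 function(d) d[1, ])))
stopifnot(
  is.integer(pairs),
  ncol(pairs) == 2,
  nrow(pairs) == length(pair.contexts),
  max_word_id < length(vocab)
)
```

With the lda package installed, objects like these could then be passed as nubbi.collapsed.gibbs.sampler(contexts, pair.contexts, pairs, K.individual, K.pair, vocab, num.iterations, alpha, eta, xi), with topic counts and hyperparameter values chosen by the user.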

Value

A fitted model as a list with the same components as returned by lda.collapsed.gibbs.sampler with the following additional components:

  • source_assignments: A list of length(pair.contexts) whose elements source_assignments[[i]] are of the same length as pair.contexts[[i]] where each entry is either 0 if the sampler assigned the word to the first entity, 1 if the sampler assigned the word to the second entity, or 2 if the sampler assigned the word to the relationship between the two.
  • document_source_sums: A matrix with three columns and length(pair.contexts) rows where each row indicates how many words were assigned to the first entity of the pair, the second entity of the pair, and the relationship between the two, respectively.
  • document_sums: Semantically similar to the entry in lda.collapsed.gibbs.sampler, except that it is a list whose first length(contexts) entries correspond to the columns of the entry in lda.collapsed.gibbs.sampler for the individual contexts, and whose remaining length(pair.contexts) entries correspond to the columns for the pair contexts.
  • topics: Like the entry in lda.collapsed.gibbs.sampler, except that it contains the concatenation of the K.individual topics and the K.pair topics.
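
To make the 0/1/2 coding of source_assignments concrete, here is a minimal base R sketch that tabulates per-pair source counts into a three-column matrix laid out like document_source_sums. The assignments below are made up rather than real sampler output, the names tabulate_sources, source_counts, and relationship_share are hypothetical, and the sketch assumes every word entry has a count of 1, so it may not reproduce document_source_sums exactly when counts differ.

```r
# Hypothetical source assignments for two pair contexts
# (0 = first entity, 1 = second entity, 2 = relationship).
source_assignments <- list(c(0L, 2L, 2L, 1L), c(2L, 2L, 0L))

# Count how many entries of one pair context fall into each source.
tabulate_sources <- function(z) {
  as.integer(table(factor(z, levels = 0:2)))
}

# One row per pair context, one column per source.
source_counts <- t(vapply(source_assignments, tabulate_sources, integer(3)))
colnames(source_counts) <- c("entity1", "entity2", "relationship")

# Fraction of each pair's description attributed to the relationship.
relationship_share <- source_counts[, "relationship"] / rowSums(source_counts)
```

A summary such as relationship_share is one way to see which pair descriptions the sampler attributes mostly to the relationship rather than to either entity alone.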

Details

The NUBBI model is a switching model wherein the description of each entity-pair can be ascribed to either the first entity of the pair, the second entity of the pair, or their relationship. The NUBBI model posits a latent space (i.e., topic model) over the individual entities, and a different latent space over entity relationships.
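
The switching behaviour can be sketched as a toy generative process in base R. This is a simplified illustration under stated assumptions, not the package's sampler or the exact model specification: symmetric Dirichlet priors are drawn via normalized gamma variates, and the names rdirichlet1 and generate_pair_word, along with all sizes and hyperparameter values, are hypothetical. For each word of a pair description, a source is drawn first (first entity, second entity, or relationship), then a topic from the latent space belonging to that source, then a word from that topic.

```r
set.seed(1)

# Hypothetical helper: one draw from a symmetric Dirichlet of dimension k.
rdirichlet1 <- function(k, conc) {
  g <- rgamma(k, shape = conc)
  g / sum(g)
}

# Toy sizes and hyperparameters (arbitrary illustrative values).
V <- 6L; K_individual <- 2L; K_pair <- 2L
alpha <- 0.5; eta <- 0.5; xi <- 1.0

# Topic-word distributions: individual topics first, then pair topics.
topics <- t(replicate(K_individual + K_pair, rdirichlet1(V, eta)))

# Topic proportions for the two entities and for their relationship.
theta_e1 <- rdirichlet1(K_individual, alpha)
theta_e2 <- rdirichlet1(K_individual, alpha)
theta_pair <- rdirichlet1(K_pair, alpha)

# Source proportions for this pair (first entity, second entity, relationship).
source_props <- rdirichlet1(3L, xi)

generate_pair_word <- function() {
  # Switch: decide which source explains this word.
  src <- sample(0:2, 1, prob = source_props)
  # Draw a topic from the latent space belonging to that source.
  topic <- switch(as.character(src),
    "0" = sample.int(K_individual, 1, prob = theta_e1),
    "1" = sample.int(K_individual, 1, prob = theta_e2),
    "2" = K_individual + sample.int(K_pair, 1, prob = theta_pair))
  # Draw a 0-indexed word id from the chosen topic.
  word <- sample.int(V, 1, prob = topics[topic, ]) - 1L
  c(source = src, topic = topic, word = word)
}

# Ten words of one simulated pair description.
pair_words <- t(replicate(10, generate_pair_word()))
```

In this sketch the rows of topics mirror the layout described under Value, with the K_individual individual topics followed by the K_pair pair topics.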

The collapsed Gibbs sampler used for this model differs from the variational inference method proposed in the paper (see References) and is highly experimental.

References

Chang, Jonathan; Boyd-Graber, Jordan; and Blei, David M. Connections between the lines: Augmenting social networks with text. KDD, 2009.

See Also

See lda.collapsed.gibbs.sampler for a description of the input formats and similar models. rtm.collapsed.gibbs.sampler is a different kind of model for document networks.

Examples

## See demo.

demo(nubbi)