
lda (version 1.1)

cora: A subset of the Cora dataset of scientific documents.

Description

A collection of 2410 scientific documents in LDA format with links and titles from the Cora search engine.

Usage

data(cora.documents)
data(cora.vocab)
data(cora.cites)
data(cora.titles)

Format

cora.documents and cora.vocab comprise a corpus of 2410 documents conforming to the LDA format.

cora.titles is a character vector giving the title of each document (i.e., one entry per element of cora.documents).

cora.cites is a list representing the citations between the documents in the collection (see rtm.collapsed.gibbs.sampler for the format).
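The objects described above can be explored with base R once the data is loaded. The following is a minimal sketch, not part of the original help page. It assumes the lda package is installed and relies on the layout documented for lda.collapsed.gibbs.sampler and rtm.collapsed.gibbs.sampler: each document is a two-row integer matrix (0-indexed word identifiers in row 1, counts in row 2), and each element of cora.cites holds 0-indexed document identifiers. The variable names doc, words, counts, and cited are illustrative only.

```r
# Assumes the lda package is installed; only data() and base R are used.
library(lda)

data(cora.documents)
data(cora.vocab)
data(cora.cites)
data(cora.titles)

# Documents and titles are expected to line up one-to-one.
length(cora.documents)
length(cora.titles)

# Decode the first document. Assumption (from the sampler help pages):
# row 1 holds 0-indexed word ids, row 2 holds the matching counts.
doc <- cora.documents[[1]]
words <- cora.vocab[doc[1, ] + 1]   # shift to R's 1-based indexing
counts <- doc[2, ]
head(data.frame(word = words, count = counts))

# Titles of the documents linked to the first document.
# Assumption: entries of cora.cites are 0-indexed document ids.
cited <- cora.cites[[1]] + 1
cora.titles[1]
cora.titles[cited]
```

If a document has no citation links, the last line simply returns an empty character vector.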

Source

Automating the construction of internet portals with machine learning. McCallum et al. Information Retrieval. 2000.

See Also

lda.collapsed.gibbs.sampler for the format of the corpus.

rtm.collapsed.gibbs.sampler for the format of the citation links.

Examples

data(cora.documents)
data(cora.vocab)
data(cora.cites)
data(cora.titles)
