
quanteda (version 0.9.8.5)

sample: Randomly sample documents or features

Description

Takes a random sample of documents or features of the specified size from a corpus or document-feature matrix, with or without replacement.

Usage

sample(x, size, replace = FALSE, prob = NULL, ...)

## Default S3 method:
sample(x, size, replace = FALSE, prob = NULL, ...)

## S3 method for class 'corpus'
sample(x, size = ndoc(x), replace = FALSE, prob = NULL, ...)

## S3 method for class 'dfm'
sample(x, size = ndoc(x), replace = FALSE, prob = NULL, what = c("documents", "features"), ...)

Arguments

x
a corpus or dfm object whose documents or features will be sampled
size
a positive number, the number of documents (or, for a dfm with what = "features", the number of features) to select
replace
Should sampling be with replacement?
prob
A vector of probability weights for obtaining the elements of the vector being sampled.
...
unused. Note that sample is not defined as a generic method in the base package.
what
dimension (of a dfm) to sample: can be documents or features
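
The arguments above follow the conventions of base R's sample. The following is a minimal base-R sketch of those semantics, not quanteda's implementation: doc_names, weights, and counts are hypothetical stand-ins for a corpus and a dfm, and it is an assumption that the quanteda methods treat size, replace, and prob the same way.

```r
# Base-R sketch of the sampling semantics; `doc_names`, `weights`, and
# `counts` are hypothetical stand-ins, not quanteda objects.
set.seed(42)
doc_names <- paste0("doc", 1:10)

# replace = FALSE: 5 distinct documents.
without_repl <- sample(doc_names, size = 5, replace = FALSE)

# replace = TRUE: the same document may be drawn more than once.
with_repl <- sample(doc_names, size = 10, replace = TRUE)

# prob: one weight per element; the weights need not sum to 1.
weights <- c(5, rep(1, 9))
weighted <- sample(doc_names, size = 3, replace = FALSE, prob = weights)

# what: choose which dimension of a toy document-by-feature count matrix
# is sampled (rows stand in for documents, columns for features).
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("doc", 1:3), paste0("feat", 1:4)))
by_docs  <- counts[sample(nrow(counts), size = 2), , drop = FALSE]
by_feats <- counts[, sample(ncol(counts), size = 2), drop = FALSE]
```

Calling set.seed before sampling makes the draw reproducible, which is useful when the sampled documents feed into later analysis.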

Value

For a corpus x: a corpus object with number of documents equal to size, drawn from the corpus x. The returned corpus object will contain all of the meta-data of the original corpus, and the same document variables for the documents selected.

For a dfm x: a dfm object with number of documents (or features, when what = "features") equal to size, drawn from the dfm x.
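
To illustrate what "the same document variables for the documents selected" means, here is a hedged base-R sketch in which a data frame stands in for a corpus. The names docs, idx, and sampled are hypothetical, and this is not how quanteda stores a corpus internally; it only shows that sampling by index keeps each text aligned with its own variables.

```r
# A data frame as a hypothetical stand-in for a corpus: one row per
# document, with `year` playing the role of a document variable.
set.seed(1)
docs <- data.frame(
  text = c("a b", "c d", "e f", "g h"),
  year = c(1789, 1793, 1797, 1801),
  stringsAsFactors = FALSE
)

# Sample document indices, then subset rows so that each selected text
# carries its own document variables along with it.
idx <- sample(seq_len(nrow(docs)), size = 2)
sampled <- docs[idx, , drop = FALSE]
```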

See Also

sample

Examples

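# load the package that provides inaugCorpus, inaugTexts, and dfm()
library(quanteda)

# sampling from a corpus
summary(sample(inaugCorpus, 5))                   # 5 documents, without replacement
summary(sample(inaugCorpus, 10, replace = TRUE))  # 10 documents, with replacement

# sampling from a dfm
myDfm <- dfm(inaugTexts[1:10], verbose = FALSE)
sample(myDfm)[, 1:10]                     # default size = ndoc(myDfm); show first 10 features
sample(myDfm, replace = TRUE)[, 1:10]     # documents drawn with replacement
sample(myDfm, what = "features")[1:10, ]  # sample features instead of documents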
