
quanteda (version 0.9.7-17)

ndoc: get the number of documents or features

Description

ndoc returns the number of documents, and nfeature the number of features, in a quanteda object, which can be a corpus, dfm, or tokenized texts.

nfeature is an alias for ntype when applied to dfm objects. For a corpus or a set of texts, "features" are defined only through tokenization, so use ntoken to count them.
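The distinction between documents, features, and tokens can be sketched in base R without quanteda. This is only an illustration under stated assumptions, not how the package computes its counts: the toy texts, the whitespace tokenizer, and the names `texts`, `tokens`, `features`, and `dfm_like` are all hypothetical.

```r
# Toy texts standing in for a corpus (hypothetical data, not from quanteda).
texts <- c(doc1 = "a b b c", doc2 = "b c d")

# Tokenize on single spaces; each list element holds one document's tokens.
tokens <- strsplit(texts, " ", fixed = TRUE)

# The feature set is the unique token types across all documents.
features <- sort(unique(unlist(tokens)))

# Document-feature count matrix: rows are documents, columns are features.
dfm_like <- t(sapply(tokens, function(tok) table(factor(tok, levels = features))))

n_documents <- nrow(dfm_like)   # the count ndoc reports for a dfm
n_features  <- ncol(dfm_like)   # the count nfeature reports for a dfm
n_tokens    <- lengths(tokens)  # per-document token counts, before any matrix exists
```

In this sketch the feature count only exists once the texts have been tokenized and arranged into a matrix, which is why a raw corpus has a document count and token counts but no feature count of its own.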

Usage

ndoc(x)
nfeature(x)

Arguments

x
a corpus or dfm object

Value

an integer count: the number of documents (ndoc) or features (nfeature) in the corpus or dfm

Examples

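library(quanteda)
# number of documents among the inaugural addresses from after 1980
ndoc(subset(inaugCorpus, Year > 1980))
ndoc(dfm(subset(inaugCorpus, Year > 1980), verbose = FALSE))
# number of features before and after trimming infrequent features
nfeature(dfm(inaugCorpus))
nfeature(trim(dfm(inaugCorpus), minDoc = 5, minCount = 10))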
