quanteda (version 0.9.6-1)

ndoc: get the number of documents or features

Description

ndoc returns the number of documents, and nfeature the number of features, in a quanteda object, which can be a corpus, dfm, or tokenized texts.

nfeature is an alias for ntype when applied to dfm objects. For a corpus or set of texts, "features" are only defined through tokenization, so you need to use ntoken to count these.

Usage

ndoc(x)

## S3 method for class 'corpus'
ndoc(x)

## S3 method for class 'dfm'
ndoc(x)

nfeature(x)

## S3 method for class 'corpus'
nfeature(x)

## S3 method for class 'dfm'
nfeature(x)

Arguments

x
a corpus or dfm object

Value

  • an integer (count) of the number of documents or features in the corpus or dfm

Examples

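# number of documents in the corpus subset of addresses after 1980
ndoc(subset(inaugCorpus, Year>1980))
# document count of a dfm built from that same subset
ndoc(dfm(subset(inaugCorpus, Year>1980), verbose=FALSE))
# number of features (types) in the dfm of the full corpus
nfeature(dfm(inaugCorpus))
# number of features remaining after trimming with minDoc=5 and minCount=10
nfeature(trim(dfm(inaugCorpus), minDoc=5, minCount=10))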
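
The distinction drawn in the Description between dfm and corpus inputs can be sketched as follows. This is a minimal illustration, assuming the bundled inaugCorpus object and the function names of this quanteda version; inaugDfm is a variable name introduced here for the example only. It relies on a dfm being a documents-by-features matrix, so the two counts are expected to agree with its row and column dimensions.

```r
library(quanteda)

# Build a dfm from the bundled inaugural corpus (inaugDfm is an illustrative name)
inaugDfm <- dfm(inaugCorpus, verbose = FALSE)

# In a dfm, documents are rows and features are columns,
# so these comparisons are expected to hold
ndoc(inaugDfm) == nrow(inaugDfm)
nfeature(inaugDfm) == ncol(inaugDfm)

# For a corpus, features exist only after tokenization,
# so count tokens per document with ntoken instead
ntoken(inaugCorpus)
```

ntoken returns one count per document rather than a single total, which is why it is the suggested route for a corpus or a set of texts.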