term.day.dist: Calculate statistics for term occurrence across days
Description
Calculate statistics for term occurrence across days
Usage
term.day.dist(dtm, meta = NULL, date.var = "date")
Arguments
dtm
A quanteda dfm. Alternatively, a DocumentTermMatrix from the tm package can be used, but then the meta parameter must be specified manually.
meta
If dtm is a quanteda dfm and meta is NULL (the default), docvars(dtm) is used to obtain the metadata. Otherwise, the user has to supply the meta data.frame, whose rows must match the rows of the dtm (i.e. each row is a document).
date.var
The name of the meta column specifying the document date. Default is "date". The values should be of type POSIXlt or POSIXct.
Value
A data.frame with statistics for each term.
freq: The number of times a term occurred
doc.freq: The number of documents in which a term occurred
days.n: The number of days on which a term occurred
days.pct: The percentage of days on which a term occurred
days.entropy: The entropy of the distribution of term frequency across days
days.entropy.norm: The normalized days.entropy, where 1 corresponds to a discrete uniform distribution of the term across days
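To make the returned columns concrete, the following base R sketch computes the same kinds of statistics on a small toy matrix. It is an illustration only, not the package's implementation. It rests on several assumptions that the text above does not confirm: Shannon entropy with the natural logarithm, normalization by the log of the number of days, a day range covering every calendar day between the first and last document, and days.pct expressed as a fraction. The names counts, doc_date, per_day and shannon_entropy are hypothetical.

```r
# Toy document-term counts: 4 documents (rows) x 2 terms (columns).
counts <- matrix(c(2, 0,
                   2, 1,
                   0, 0,
                   4, 0),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("doc", 1:4), c("alpha", "beta")))

# One date per document, in the same order as the rows of counts.
doc_date <- as.Date(c("2024-01-01", "2024-01-01", "2024-01-02", "2024-01-04"))

# Assumption: the day range is every calendar day from first to last document.
days <- seq(min(doc_date), max(doc_date), by = "day")

# Aggregate term counts per day; days without documents stay at zero.
per_day <- matrix(0, nrow = length(days), ncol = ncol(counts),
                  dimnames = list(as.character(days), colnames(counts)))
day_idx <- match(as.character(doc_date), as.character(days))
for (i in seq_len(nrow(counts))) {
  per_day[day_idx[i], ] <- per_day[day_idx[i], ] + counts[i, ]
}

# Assumption: Shannon entropy with the natural log, ignoring zero cells.
shannon_entropy <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

stats <- data.frame(
  freq              = colSums(counts),       # total occurrences
  doc.freq          = colSums(counts > 0),   # documents containing the term
  days.n            = colSums(per_day > 0),  # days on which the term occurs
  days.pct          = colSums(per_day > 0) / length(days),  # as a fraction
  days.entropy      = apply(per_day, 2, shannon_entropy),
  # Assumption: dividing by log(number of days) maps a uniform spread to 1.
  days.entropy.norm = apply(per_day, 2, shannon_entropy) / log(length(days))
)

print(stats)
```

In this toy data, "alpha" is split evenly over two of the four days, so its normalized entropy under these assumptions is 0.5, while "beta" occurs on a single day and has entropy 0. A term spread evenly over all days would reach 1.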