Generates a contingency table from user-specified document covariates and a document term matrix.
contingency_table(metadata, document_term_matrix, vocabulary = NULL,
variables_to_use = NULL, threshold = 0, force_dense = FALSE)
metadata: A data.frame containing document covariates.
document_term_matrix: A documents x vocabulary matrix with counts of unique words in each document. Can be a dense or sparse matrix.
vocabulary: A character vector corresponding to the columns of document_term_matrix. If NULL, the column names of document_term_matrix will be used. Defaults to NULL.
variables_to_use: The columns of metadata to use, specified as a vector of column indexes or column names. Defaults to NULL, in which case all columns of the metadata data frame will be used.
threshold: The number of times a unique value of a variable must appear in order to be included in the returned object. Allows the user to ignore very infrequent values. Defaults to 0.
force_dense: If TRUE, forces the returned contingency table to be a dense matrix. Otherwise, the function will automatically generate a sparse matrix contingency table if the contingency table would have more than 100,000 entries. Defaults to FALSE.
A contingency table.
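The following base R sketch illustrates the kind of aggregation described above: term counts are summed over the documents that share each value of each selected covariate. It is an illustration under stated assumptions, not the function's actual implementation. The helper sketch_contingency_table, the toy data, the "variable_value" row naming, and the reading of threshold as "appears at least this many times" are all assumptions made for the example.

```r
# Toy inputs (hypothetical data, for illustration only).
metadata <- data.frame(
  party = c("A", "B", "A", "B"),
  chamber = c("upper", "upper", "lower", "upper"),
  stringsAsFactors = FALSE
)
document_term_matrix <- matrix(
  c(2, 0, 1,
    0, 3, 1,
    1, 1, 0,
    4, 0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("tax", "health", "vote"))
)

# Hypothetical helper, not part of any package: sums term counts over the
# documents that share each value of each selected covariate.
sketch_contingency_table <- function(metadata, document_term_matrix,
                                     variables_to_use = NULL, threshold = 0) {
  if (is.null(variables_to_use)) {
    variables_to_use <- seq_len(ncol(metadata))
  }
  rows <- list()
  for (variable in names(metadata[, variables_to_use, drop = FALSE])) {
    values <- as.character(metadata[[variable]])
    value_counts <- table(values)
    # Assumed semantics: keep values that appear at least `threshold` times.
    kept <- names(value_counts)[value_counts >= threshold]
    for (value in kept) {
      in_group <- values == value
      # Row names of the form "variable_value" are an assumption.
      rows[[paste(variable, value, sep = "_")]] <-
        colSums(document_term_matrix[in_group, , drop = FALSE])
    }
  }
  # One row per covariate value, one column per vocabulary term.
  do.call(rbind, rows)
}

result <- sketch_contingency_table(metadata, document_term_matrix)
print(result)
```

Calling the helper with threshold = 2 drops the chamber value "lower", which appears in only one document, so the result has three rows instead of four.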