A function to combine multiple document term matrices into a single aggregate document term matrix.
combine_document_term_matrices(document_term_matrix_list,
vocabulary_list = NULL, use_column_names_as_vocabularies = FALSE)
A list of document term matrices, preferably generated using generate_document_term_matrix(), each of which corresponds to a vocabulary in vocabulary_list.
A list of string vectors containing the vocabularies associated with each document term matrix. The j'th entry in each of these vectors should correspond to the j'th column in the associated document term matrix. Defaults to NULL. If use_column_names_as_vocabularies = TRUE, the vocabularies are extracted from the column names of the document term matrices; otherwise they must be provided here.
Defaults to FALSE. If TRUE, the function will attempt to extract vocabularies from the column names of each document term matrix.
An aggregate document term matrix whose columns are named for the words in the combined vocabulary and ordered from the most frequently used term to the least frequently used term.
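
The behavior described above can be illustrated with a minimal base R sketch. This is not the function's actual implementation: combine_dtms_sketch() is a hypothetical helper written only for illustration, and it assumes that the documents (rows) of the input matrices are stacked, that the aggregate vocabulary is the union of the input vocabularies, and that term frequency means the column sum over all documents.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# Assumes rows (documents) are stacked and columns are the union of all terms.
combine_dtms_sketch <- function(dtm_list, vocabulary_list = NULL,
                                use_column_names = FALSE) {
  if (use_column_names) {
    # Take each vocabulary from the column names of its matrix.
    vocabulary_list <- lapply(dtm_list, colnames)
  }
  if (is.null(vocabulary_list)) {
    stop("Provide vocabulary_list or set use_column_names = TRUE.")
  }
  # Union of all terms across the input vocabularies.
  vocabulary <- unique(unlist(vocabulary_list))
  total_docs <- sum(vapply(dtm_list, nrow, integer(1)))
  combined <- matrix(0, nrow = total_docs, ncol = length(vocabulary),
                     dimnames = list(NULL, vocabulary))
  row_start <- 1
  for (i in seq_along(dtm_list)) {
    dtm <- as.matrix(dtm_list[[i]])
    rows <- seq(row_start, length.out = nrow(dtm))
    # Column j of matrix i holds the counts for vocabulary_list[[i]][j].
    combined[rows, vocabulary_list[[i]]] <- dtm
    row_start <- row_start + nrow(dtm)
  }
  # Order columns from the most to the least frequently used term.
  combined[, order(colSums(combined), decreasing = TRUE), drop = FALSE]
}

# Two small matrices with overlapping vocabularies (made-up counts).
dtm_a <- matrix(c(2, 0, 1, 3), nrow = 2,
                dimnames = list(NULL, c("apple", "pear")))
dtm_b <- matrix(c(1, 9, 5), nrow = 1,
                dimnames = list(NULL, c("pear", "fig", "apple")))
combined <- combine_dtms_sketch(list(dtm_a, dtm_b), use_column_names = TRUE)
print(combined)
```

In this sketch, a term that is absent from one input matrix simply gets a count of zero for that matrix's documents, which is why the vocabularies must line up with the columns of their matrices.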