preText (version 0.6.2)

preText_test: preText Test

Description

Calculates preText scores for each preprocessing specification.

Usage

preText_test(distance_matrices, choices, labels = NULL,
  baseline_index = 128, text_size = 1, num_comparisons = 50,
  parallel = FALSE, cores = 1, verbose = TRUE)

Arguments

distance_matrices

A list of document distance matrices generated by the `scaling_comparison()` function and returned in the `$distance_matrices` field.

choices

A dataframe indicating whether a preprocessing step was applied or not, for each preprocessing step. This is generated by the `factorial_preprocessing()` function and returned in the `$choices` field.

labels

Optional argument giving names for each preprocessing step. This is generated by the `factorial_preprocessing()` function and returned in the `$labels` field.

baseline_index

The index of the baseline distance matrix against which we are comparing. Defaults to 128, which is the most minimal preprocessing for our current implementation.

text_size

The `cex` (character expansion factor) for text in the dot plot generated by the function.

num_comparisons

The number of ranks to use in calculating the average difference. Defaults to 50.

parallel

Logical indicating whether factorial preprocessing should be performed in parallel. Defaults to FALSE.

cores

The number of cores to use. Defaults to 1; can be set to any number less than or equal to the number of cores on your computer.

verbose

Logical indicating whether more information should be printed to the screen to let the user know about progress. Defaults to TRUE.
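
As a minimal sketch of how values for `parallel` and `cores` might be chosen, the snippet below counts the available cores with `detectCores()` from R's bundled `parallel` package. Leaving one core free is a convention assumed here, not something preText requires, and `available`, `cores_to_use` and `run_parallel` are hypothetical names used only for illustration.

```r
# Count the cores R can see; detectCores() may return NA on some platforms.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Assumption: leave one core free for the rest of the system.
cores_to_use <- max(1L, available - 1L)

# Only request parallel execution when more than one core would be used.
run_parallel <- cores_to_use > 1L
```

These values could then be passed as `parallel = run_parallel, cores = cores_to_use`.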

Value

A result list object.

Examples

# NOT RUN {
# *** This function is used automatically inside of the preText() function.
# load the package
library(preText)
# load in the data
data("UK_Manifestos")
# preprocess data
preprocessed_documents <- factorial_preprocessing(
    UK_Manifestos,
    use_ngrams = TRUE,
    infrequent_term_threshold = 0.02,
    verbose = TRUE)
# scale documents
scaling_results <- scaling_comparison(preprocessed_documents$dfm_list,
                                      dimensions = 2,
                                      distance_method = "cosine",
                                      verbose = TRUE)
# run preText test
preText_test_results <- preText_test(scaling_results$distance_matrices,
                                     choices = preprocessed_documents$choices,
                                     labels = preprocessed_documents$labels,
                                     baseline_index = 128,
                                     text_size = 1,
                                     num_comparisons = 50,
                                     parallel = FALSE,
                                     cores = 1,
                                     verbose = TRUE)
# }
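
The default `baseline_index = 128` is consistent with a full factorial design over seven binary preprocessing steps (2^7 = 128 combinations). The sketch below illustrates that arithmetic in base R only. The step names are hypothetical, and the row ordering (all steps applied first, none applied last) is an assumption made for illustration; the actual layout is whatever `factorial_preprocessing()` returns in `$choices`.

```r
# Hypothetical step names; the real column names come from
# factorial_preprocessing()$choices.
steps <- paste0("step_", 1:7)

# Every TRUE/FALSE combination of the seven steps: 2^7 = 128 rows.
grid <- expand.grid(rep(list(c(TRUE, FALSE)), length(steps)))
names(grid) <- steps

n_specifications <- nrow(grid)  # 128

# Assumed ordering: TRUE is listed first, so the last row applies no steps.
last_row_is_minimal <- all(!unlist(grid[n_specifications, ]))  # TRUE
```

Under that assumed ordering, row 128 is the specification with no steps applied, which matches the description of the default baseline as the most minimal preprocessing.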