preText (version 0.6.2)

dfm_scaling_test: Comparison of dfms using N-dimensional scaling, with a test for difference from the mean dfm scaled position.

Description

Scales each document-feature matrix (dfm) into an N-dimensional space and tests for outliers.
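As a rough intuition for "difference from the mean scaled position", the base R sketch below measures how far each of several scaled points lies from their centroid. It is only an illustration under stated assumptions: the positions, the labels `dfm_1` to `dfm_5`, and the column names `dim1`, `dim2`, and `distance_from_mean` are all invented here, and this is not the test that `dfm_scaling_test()` itself performs.

```r
# Hypothetical 2-d positions for five dfms (all values invented for illustration).
positions <- data.frame(
  label = paste0("dfm_", 1:5),
  dim1  = c(0.10, 0.12, 0.08, 0.11,  0.90),
  dim2  = c(0.20, 0.18, 0.22, 0.19, -0.60),
  stringsAsFactors = FALSE
)

# Mean position (centroid) across all dfms.
coords   <- as.matrix(positions[, c("dim1", "dim2")])
centroid <- colMeans(coords)

# Euclidean distance of each dfm from the centroid.
deviations <- sweep(coords, 2, centroid)
positions$distance_from_mean <- sqrt(rowSums(deviations^2))

# Sort so that the dfm furthest from the mean position comes first.
positions[order(-positions$distance_from_mean), ]
```

In this toy data the last point sits far from the other four, so it ends up at the top of the sorted table.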

Usage

dfm_scaling_test(scaling_results, labels, dimensions = 2,
  distance_method = "cosine", method = c("distances", "positions"),
  return_positions = FALSE)

Arguments

scaling_results

A list object produced by the `scaling_comparison()` function.

labels

A character vector with labels for each dfm. This can be extracted from the `$labels` field of the output from the `factorial_preprocessing()` function.

dimensions

The number of dimensions to be used by the multidimensional scaling algorithm. Defaults to 2.

distance_method

The method that should be used for calculating distances between dfms. Defaults to "cosine".
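To make the `dimensions` and `distance_method` arguments more concrete, here is a minimal base R sketch of cosine distance followed by classical multidimensional scaling with `stats::cmdscale()`. The toy matrix and the helper `cosine_distance()` are hypothetical and written only for this illustration; the package's internal computation may differ.

```r
# Toy matrix: 6 rows to be compared, 6 features (values invented for illustration).
toy_matrix <- matrix(
  c(3, 1, 0, 0, 0, 0,
    2, 1, 0, 0, 0, 0,
    0, 0, 3, 1, 0, 0,
    0, 0, 2, 1, 0, 0,
    0, 0, 0, 0, 3, 1,
    0, 0, 0, 0, 2, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("doc", 1:6), paste0("term", 1:6))
)

# Hypothetical helper: cosine distance = 1 - cosine similarity between rows.
cosine_distance <- function(m) {
  norms      <- sqrt(rowSums(m^2))
  similarity <- (m %*% t(m)) / (norms %o% norms)
  as.dist(1 - similarity)
}

doc_distances <- cosine_distance(toy_matrix)

# Classical multidimensional scaling into 2 dimensions (k plays the role of `dimensions`).
positions <- cmdscale(doc_distances, k = 2)
positions
```

Rows that share features (for example `doc1` and `doc2`) have a cosine distance near 0, while rows with no features in common have a distance of 1, so the scaled positions form three tight groups.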

method

Whether the raw distances or the scaled document positions should be used for scaling. One of `"distances"` or `"positions"`; defaults to `"distances"`.

return_positions

Logical indicating whether dfm positions should be returned as a `data.frame`. Defaults to `FALSE`.

Value

A result list object, or a plot, or both.

Examples

# NOT RUN {
# *** This function is used automatically inside of the preText() function.
# load the package
library(preText)
# load in the data
data("UK_Manifestos")
# preprocess data
preprocessed_documents <- factorial_preprocessing(
    UK_Manifestos,
    use_ngrams = TRUE,
    infrequent_term_threshold = 0.02,
    verbose = TRUE)
# scale documents
scaling_results <- scaling_comparison(preprocessed_documents$dfm_list,
                                      dimensions = 2,
                                      distance_method = "cosine",
                                      verbose = TRUE)
# now perform the scaling test
dfm_scaling_test(scaling_results,
                 labels = preprocessed_documents$labels)
# }