stmCorrViz (version 1.3)

findThreshold: Find appropriate threshold range

Description

This function performs a grid search over candidate clustering thresholds to identify the range of valid values and to show how the level of aggregation varies within that range.

Usage

findThreshold(mod, documents_raw=NULL, documents_matrix=NULL, range_min=.05, range_max=5, step=.05)

Arguments

mod
A fitted STM object from stm.
documents_raw
The raw documents used to generate the STM model. A character vector where each entry is the full text of a document.
documents_matrix
Document-term matrix representation of the raw documents, as generated by the prepDocuments function.
range_min
Lower bound of the range to be searched.
range_max
Upper bound of the range to be searched.
step
Step size for the grid search.
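
The three search arguments together define the set of candidate thresholds. The following is a minimal sketch of that grid using the default values from the Usage section; it assumes the candidates form an evenly spaced sequence, which this page does not confirm, and `candidate_thresholds` is an illustrative name rather than part of the package.

```r
# Sketch only: assumes the grid is an evenly spaced sequence from range_min
# to range_max; the package's internal construction may differ.
range_min <- 0.05
range_max <- 5
step <- 0.05

candidate_thresholds <- seq(from = range_min, to = range_max, by = step)

# Number of thresholds that would be evaluated under this assumption
length(candidate_thresholds)
```

A smaller step gives a finer picture of the valid range at the cost of more clustering runs.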

Value

A data frame containing the following columns:
  1. threshold: Threshold value.
  2. valid: Binary value; 1 if clustering succeeds with the given threshold, 0 if not.
  3. juncture_points: Number of juncture points in the resulting clustering tree; -1 if the run is unsuccessful. Lower threshold values yield more juncture points, corresponding to more binary splits and deeper trees. Higher threshold values yield fewer juncture points, corresponding to trees that are broad rather than deep.
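
The returned data frame can be filtered with ordinary base R to find the valid range. The sketch below uses a hand-built data frame with the three documented columns; the values are invented for illustration and `res`, `valid_runs`, and `valid_range` are hypothetical names. In practice `res` would be the result of a findThreshold call on a fitted model.

```r
# Hypothetical result: values are made up for illustration only, shaped like
# the documented return value (threshold, valid, juncture_points).
res <- data.frame(
  threshold = c(0.5, 1.0, 1.5, 2.0, 2.5),
  valid = c(0, 1, 1, 1, 0),
  juncture_points = c(-1, 12, 7, 3, -1)
)

# Keep only thresholds for which clustering succeeded
valid_runs <- res[res$valid == 1, ]

# Lower and upper bounds of the valid threshold range
valid_range <- range(valid_runs$threshold)

# Inspect how the number of juncture points changes across valid thresholds
valid_runs[order(valid_runs$threshold), c("threshold", "juncture_points")]
```

A threshold can then be chosen from within `valid_range` according to how deep or broad a tree is wanted.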

See Also

stmCorrViz