stmCorrViz (version 1.3)

stmCorrViz: Generate STM Correlation Tree

Description

This function generates an interactive, full-model HTML visualization of topic hierarchies for a fitted STM model. The visualization highlights the correlations among topics and can be used to view the model at differing levels of complexity. The function uses the D3.js visualization library, so the output must be viewed in a compatible web browser.

Usage

stmCorrViz(mod, file_out, documents_raw=NULL, documents_matrix=NULL,
  title="STM Model", clustering_threshold=FALSE,
  search_options=list(range_min=.05, range_max=5, step=.05),
  labels_number=7, display=TRUE, verbose=FALSE)

Arguments

mod
A fitted STM object from stm.
file_out
Name of the output file that will be generated by the function. This should end with an HTML extension.
documents_raw
The raw documents used to generate the STM model. A character vector where each entry is the full text of a document.
documents_matrix
Document-term matrix representation of the raw documents, as generated by the prepDocuments function.
title
Root node label. This defaults to "STM Model".
clustering_threshold
A parameter specifying the level of aggregation in the hierarchical clustering routine for topics. Lower threshold values produce more binary splits and deeper trees, while higher threshold values produce more aggregation and trees that have significant breadth rather than depth. See below for more details.

If FALSE, a grid search for valid thresholds is performed using findThreshold. The valid clustering threshold that yields a median level of tree complexity is chosen.

search_options
List specifying the grid search parameters to be used by findThreshold. Only necessary if clustering_threshold is FALSE.
labels_number
The number of top words used to label each node (topic or topical cluster) in the visualization.
display
Boolean. If set to TRUE, the visualization is launched in the system's default web browser upon function execution.
verbose
Boolean. If set to TRUE, displays function progress in the console during execution.
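
As a rough illustration of what the default search_options describe, the sketch below builds an evenly spaced grid of candidate thresholds from range_min, range_max and step. It assumes the grid is a plain arithmetic sequence; how findThreshold actually constructs and evaluates its candidates is not stated here, so treat this as an assumption rather than the package's implementation.

```r
# Default grid search parameters, copied from the Usage section
search_options <- list(range_min = .05, range_max = 5, step = .05)

# ASSUMPTION: candidate thresholds form an evenly spaced sequence
# from range_min to range_max in increments of step
grid <- seq(search_options$range_min,
            search_options$range_max,
            by = search_options$step)

head(grid)     # smallest candidate thresholds
length(grid)   # number of candidates in the grid
```

A coarser step or a narrower range reduces the number of candidates, which should shorten the search at the cost of a less finely tuned threshold.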

Details

This function generates a full-model, interactive, general-purpose hierarchical representation of an STM model. First a hierarchy of topics is created using hierarchical clustering as implemented in hclust. Then the hierarchy is written out to a JSON object using stmJSON. Finally D3.js is used to create an interactive visualization.
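
The clustering step can be sketched with base R alone. The example below simulates document-topic proportions (a hypothetical stand-in for a fitted model), computes topic correlations, and passes them to hclust. The dissimilarity measure (1 minus the correlation), the linkage method, and the cut height are assumptions chosen for illustration; the package internals may differ.

```r
# Simulated document-topic proportions: a hypothetical stand-in for
# the topic proportions of a fitted STM model
set.seed(1)
n_docs <- 200
n_topics <- 6
raw <- matrix(rgamma(n_docs * n_topics, shape = 0.5), nrow = n_docs)
theta <- raw / rowSums(raw)            # normalize so each row sums to 1
colnames(theta) <- paste0("Topic", seq_len(n_topics))

# Topic-by-topic correlation matrix
topic_cor <- cor(theta)

# ASSUMPTION: use 1 - correlation as the dissimilarity between topics
topic_dist <- as.dist(1 - topic_cor)

# Agglomerative hierarchical clustering of the topics
tree <- hclust(topic_dist, method = "complete")

# Cutting the tree at a height groups topics into clusters;
# the height here is arbitrary and only for illustration
clusters <- cutree(tree, h = 1.1)
print(clusters)
```

Lower cut heights leave more, smaller clusters, while higher cut heights merge topics into fewer, broader groups, which loosely mirrors the role of clustering_threshold described under Arguments.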

The visualization is built as an HTML page, and as such requires a web browser for inspection. The function does not return an object, but writes HTML output to disk.

The visualization takes the form of an indented tree. The leaves of the tree correspond to topics, and the leaf nodes are grouped into topic clusters, which allows the model to be visualized at differing levels of aggregation. The visualization is largely built on top of Mike Bostock's Collapsible Indented Tree block for D3.js. The nested JSON structure representing the hierarchical model is produced by the stmJSON function.

References

Bostock M, Ogievetsky V, Heer J. D3: Data-Driven Documents. IEEE Transactions on Visualization and Computer Graphics 17.12 (2011): 2301-2309.

Margaret E. Roberts, Brandon M. Stewart and Dustin Tingley (2014). stm: R Package for Structural Topic Models.

See Also

stmJSON

Examples

data(immigration_perceptions)

stmCorrViz(immigration_perceptions$model, "corrviz.html", 
  documents_raw=immigration_perceptions$raw_documents, 
  documents_matrix=immigration_perceptions$documents_matrix)
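
For non-interactive use (for example in a script or on a server), a variant of the call above might write to a temporary directory and skip launching the browser. This is a sketch that uses only arguments listed under Usage; the output path, title and label count are illustrative choices, and the call is guarded so that it only runs when the package is installed.

```r
# Illustrative output location; any path ending in .html should do
out_file <- file.path(tempdir(), "corrviz.html")

if (requireNamespace("stmCorrViz", quietly = TRUE)) {
  library(stmCorrViz)
  data(immigration_perceptions)

  stmCorrViz(immigration_perceptions$model, out_file,
    documents_raw = immigration_perceptions$raw_documents,
    documents_matrix = immigration_perceptions$documents_matrix,
    title = "Immigration Perceptions",  # root node label
    labels_number = 5,                  # top words shown per node
    display = FALSE,                    # do not open a web browser
    verbose = TRUE)                     # print progress to the console
}
```

Because the function writes HTML to disk rather than returning an object, checking file.exists(out_file) afterwards is a simple way to confirm that the visualization was produced.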