plot.STM: Functions for plotting STM objects

Description

Produces one of four types of plots for an STM object. The default option "summary" prints topic words with their corpus frequency. "labels" is for easy printing of tables of indicative words for each topic. "perspectives" depicts differences between two topics, content covariates or combinations. "hist" creates a histogram of the expected distribution of topic proportions across the documents.

Usage

# S3 method for STM
plot(
  x,
  type = c("summary", "labels", "perspectives", "hist"),
  n = NULL,
  topics = NULL,
  labeltype = c("prob", "frex", "lift", "score"),
  frexw = 0.5,
  main = NULL,
  xlim = NULL,
  ylim = NULL,
  xlab = NULL,
  family = "",
  width = 80,
  covarlevels = NULL,
  plabels = NULL,
  text.cex = 1,
  custom.labels = NULL,
  topic.names = NULL,
  ...
)

Arguments

x: Model output from stm.
type: Sets the desired type of plot. See details for more information.
n: Sets the number of words used to label each topic. In perspective plots it approximately sets the total number of words in the plot. The defaults are 3, 20 and 25 for summary, labels and perspectives respectively. n must be greater than or equal to 2.
topics: Vector of topics to display. For perspectives plots this must be a vector of length one or two. For the other types it defaults to all topics.
labeltype: Determines which option of "prob", "frex", "lift", "score" is used for choosing the most important words. See labelTopics for more detail. Passing an argument to custom.labels will override this. Note that this does not apply to perspectives type which always uses highest probability words.
frexw: If "frex" labeltype is used, this will be the frex weight.
main: Title of the plot.
xlim: Range of the X-axis.
ylim: Range of the Y-axis.
xlab: Labels for the X-axis. For perspective plots, use plabels instead.
family: The font family. Most users will not need to specify this, but it can be useful when working with other character sets. See par.
width: Sets the width, in number of characters, used for string wrapping in type "labels".
covarlevels: A vector of length one or length two which contains the levels of the content covariate to be used in perspective plots.
plabels: This option can be used to override the default labels in the perspective plot that appear along the x-axis. It should be a character vector of length two which has the left hand side label first.
text.cex: Controls the scaling constant on text size.
custom.labels: A vector of custom labels. Supplying this argument overrides labeltype.
topic.names: A vector of custom topic names. Defaults to "Topic #: ".
...: Additional parameters passed to plotting functions.
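
As a rough sketch of how several of these arguments combine, the calls below use gadarianFit, the fitted model that the Examples section also relies on. The label strings and topic names (my_labels, my_names) are made up for illustration, and the sketch assumes gadarianFit is available once the stm package is attached.

```r
library(stm)

# Number of topics in the fitted model (columns of the document-topic matrix).
K <- ncol(gadarianFit$theta)

# Summary plot labelled with the top 5 FREX words per topic.
plot(gadarianFit, type = "summary", n = 5, labeltype = "frex", frexw = 0.5)

# Labels plot for the first two topics only, wrapped at 40 characters.
plot(gadarianFit, type = "labels", topics = 1:2, labeltype = "prob", width = 40)

# Summary plot with hand-written labels and topic names (illustrative strings,
# one per topic); custom.labels takes the place of the labeltype choice.
my_labels <- paste("label", seq_len(K))
my_names <- paste0("T", seq_len(K), ": ")
plot(gadarianFit, type = "summary",
     custom.labels = my_labels, topic.names = my_names)
```

Deriving K from the model rather than hard-coding it keeps the custom vectors the same length as the number of topics.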

Details

The function can produce four types of plots summarizing an STM object; the argument type selects which one.

summary displays the topics ordered by their expected frequency across the corpus.

labels plots the top words, selected according to the chosen criterion, for each selected topic.

perspectives plots two topics or topic-covariate combinations. Words are sized in proportion to their use within the plotted topic-covariate combinations and positioned along the X-axis according to how much they favor one of the two configurations. If the words overlap, the user can either enlarge the plot or reduce the total number of words on the plot. The vertical placement of the words is random, so rerunning the plot produces a different layout each time. Note that perspectives plots do not use any of the labeling options directly.

hist plots a histogram of the MAP estimates of the document-topic loadings across all documents. The median is marked by a dashed red line.
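
Because the vertical placement in a perspectives plot is random, one way to get a repeatable figure is to set the seed first. The sketch below assumes the placement draws from R's standard random number generator; the seed value, the word count n_words and the axis labels are arbitrary choices for illustration.

```r
library(stm)

# Fix the RNG seed so the random vertical placement of words is repeatable
# (assumes the layout uses R's standard random number generator).
set.seed(2014)

# Use fewer words than the default of 25 to reduce overlap.
n_words <- 15

# Compare topics 1 and 2; plabels overrides the two x-axis labels,
# with the left-hand label given first.
axis_labels <- c("Topic 1", "Topic 2")
plot(gadarianFit, type = "perspectives", topics = c(1, 2),
     n = n_words, plabels = axis_labels)
```

If words still overlap, lowering n_words further or opening a larger graphics device are the two remedies the text describes.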

References

Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand. "Structural Topic Models for Open-Ended Survey Responses." American Journal of Political Science 58, no. 4 (2014): 1064-1082.

Examples

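library(stm)

# Examples with the Gadarian data (gadarianFit is a fitted STM object)
plot(gadarianFit)                                          # default "summary" plot
plot(gadarianFit, type = "labels")                         # top words per topic
plot(gadarianFit, type = "perspectives", topics = c(1, 2)) # contrast topics 1 and 2
plot(gadarianFit, type = "hist")                           # histograms of topic proportions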