Learn R Programming

quanteda (version 1.5.2)

textplot_xray: Plot the dispersion of key word(s)

Description

Plots a dispersion or "x-ray" plot of selected word pattern(s) across one or more texts. The format of the plot depends on the number of kwic class objects passed: if there is only one document, keywords are plotted one below the other. If there are multiple documents the documents are plotted one below the other, with keywords shown side-by-side. Given that this returns a ggplot2 object, you can modify the plot by adding ggplot2 layers (see example).

Usage

textplot_xray(..., scale = c("absolute", "relative"), sort = FALSE)

Arguments

...

any number of kwic class objects

scale

whether to scale the token index axis by absolute position of the token in the document or by relative position. Defaults are absolute for single document and relative for multiple documents.

sort

whether to sort the rows of a multiple document plot by document name

Value

a ggplot2 object

Known Issues

These are known issues on which we are working to solve in future versions:

  • textplot_xray() will not display the patterns correctly when these are multi-token sequences.

  • For dictionaries with keys that have overlapping value matches to tokens in the text, only the first match will be used in the plot. The way around this is to produce one kwic per dictionary key, and send them as a list to textplot_xray.

Examples

Run this code
# NOT RUN {
corp <- corpus_subset(data_corpus_inaugural, Year > 1970)
# compare multiple documents
textplot_xray(kwic(corp, pattern = "american"))
textplot_xray(kwic(corp, pattern = "american"), scale = "absolute")

# compare multiple terms across multiple documents
textplot_xray(kwic(corp, pattern = "america*"), 
              kwic(corp, pattern = "people"))

# how to modify the ggplot with different options
library(ggplot2)
tplot <- textplot_xray(kwic(corp, pattern = "american"), 
                   kwic(corp, pattern = "people"))
tplot + aes(color = keyword) + scale_color_manual(values = c('red', 'blue'))

# adjust the names of the document names
docnames(corp) <- apply(docvars(corp, c("Year", "President")), 1, paste, collapse = ", ")
textplot_xray(kwic(corp, pattern = "america*"), 
              kwic(corp, pattern = "people"))
# }

Run the code above in your browser using DataLab