Generates plots similar to those in Monroe et al., "Fightin' Words...".
fightin_words_plot(feature_selection_object, title = "",
positive_category = "Category 1", negative_category = "Category 2",
xlab = "term count", display_top_words = 20,
display_terms_next_to_points = FALSE, size_terms_by_frequency = FALSE,
right_margin = 20, max_terms_to_display = 1e+05,
use_subsumed_ngrams = FALSE, limits = NULL,
clean_publication_plots = FALSE, rank_by_log_odds = FALSE)
feature_selection_object: A list object generated by the feature_selection function.
title: A user-supplied title for the plot. Defaults to "", in which case a blank title is displayed.
positive_category: The name to give to the first category specified when using the feature_selection function. Defaults to "Category 1".
negative_category: The name to give to the second category specified when using the feature_selection function. Defaults to "Category 2".
xlab: The x-axis label. Defaults to "term count", but can be modified as necessary.
display_top_words: The number of top terms displayed for each category in the plot. Defaults to 20.
display_terms_next_to_points: Optional argument, defaults to FALSE. If TRUE, terms are displayed next to their corresponding points on the plot. This can make the plot messy.
size_terms_by_frequency: Optional argument, defaults to FALSE. If TRUE, the top terms are printed at sizes proportional to their frequency.
right_margin: Controls how much space is reserved for the right margin of the plot (for displaying top terms). Defaults to 20, but can be adjusted depending on the length of the terms.
max_terms_to_display: The maximum number of terms to plot. Defaults to 100,000. Used to prevent overloading the plotting device with very large vocabularies.
use_subsumed_ngrams: Logical indicating whether subsumed n-grams should be used when displaying top terms. Defaults to FALSE. This only works if the user has set subsume_ngrams = TRUE in the feature_selection() function (and is using a vocabulary containing overlapping n-grams).
limits: An optional numeric vector of length two, where the first number is the upper x limit (term count) and the second number is the absolute value of the maximum z-score to display (the y limit). Defaults to NULL, in which case suitable values are determined automatically. Setting the same limits on several plots can be useful for comparing them.
clean_publication_plots: Logical. If TRUE, labels inside the plot are removed and all dots are colored uniformly. Defaults to FALSE.
rank_by_log_odds: Only applicable for the "informed_Dirichlet" method. Defaults to FALSE. If TRUE, terms are ranked by log odds instead of z-score.
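To illustrate why ranking by log odds and ranking by z-score can give different orderings, here is a minimal base R sketch of the informative Dirichlet log-odds statistic described by Monroe et al. It is not taken from this package's source. The term counts are made up, and the prior (pooled counts rescaled to sum to 10) is an assumption chosen only for illustration.

```r
# Hypothetical term counts for two categories (illustrative data only).
counts_1 <- c(tax = 40, health = 5, vote = 20, rare = 1)
counts_2 <- c(tax = 10, health = 30, vote = 20, rare = 0)

# Assumed prior: pooled counts rescaled to a total of 10 pseudo-counts.
pooled <- counts_1 + counts_2
prior <- pooled / sum(pooled) * 10

n_1 <- sum(counts_1)   # total tokens in category 1
n_2 <- sum(counts_2)   # total tokens in category 2
a_0 <- sum(prior)      # total prior mass

# Prior-smoothed log odds of each term within each category.
log_odds_1 <- log((counts_1 + prior) / (n_1 + a_0 - counts_1 - prior))
log_odds_2 <- log((counts_2 + prior) / (n_2 + a_0 - counts_2 - prior))

# Difference in log odds between the two categories.
delta <- log_odds_1 - log_odds_2

# Approximate variance of the difference, and the resulting z-score.
variance <- 1 / (counts_1 + prior) + 1 / (counts_2 + prior)
z_score <- delta / sqrt(variance)

# Terms ordered from most associated with category 1 to most with category 2.
rank_by_delta <- names(sort(delta, decreasing = TRUE))
rank_by_z <- names(sort(z_score, decreasing = TRUE))

print(rank_by_delta)
print(rank_by_z)
```

In this toy data the term "rare" has the largest log odds difference but a large variance, so "tax" ranks first by z-score instead. This is the kind of difference that rank_by_log_odds = TRUE would be expected to expose.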
Returns a Fightin' Words plot.
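A hedged usage sketch for drawing two plots on a common scale through the limits argument. The objects fs_a and fs_b are hypothetical results of feature_selection(), and the count and z-score vectors are made-up stand-ins for whatever per-term summaries are available. Only the argument names shown in the usage above are relied on.

```r
# Hypothetical per-term summaries for two separate comparisons (made-up numbers).
counts_a <- c(120, 45, 8)
z_a <- c(4.2, -6.1, 1.3)
counts_b <- c(300, 20, 15)
z_b <- c(-2.5, 3.8, 7.4)

# limits = c(upper x limit in term counts, maximum absolute z-score).
shared_limits <- c(max(counts_a, counts_b), max(abs(c(z_a, z_b))))

# fs_a and fs_b are hypothetical outputs of feature_selection(); the calls
# below only run if those objects and the plotting function are available.
if (exists("fs_a") && exists("fs_b") && exists("fightin_words_plot")) {
  fightin_words_plot(fs_a, title = "Comparison A", limits = shared_limits)
  fightin_words_plot(fs_b, title = "Comparison B", limits = shared_limits)
}
```

Passing the same limits vector to both calls keeps the axes identical, which makes the two plots easier to compare side by side.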