viztest: Calculate Correspondence Between Pairwise Test and CI Overlaps

Description

viztest() does a grid search over range_levels to find the confidence level(s) at which the (non-)overlaps in confidence intervals correspond as closely as possible with the results of pairwise tests. If a level is found that accounts for all pairwise tests, confidence bounds at this level can be added to coefficient or marginal effects plots so that readers can reliably identify estimates that are statistically different from each other.

Usage

viztest(
  obj,
  test_level = 0.05,
  range_levels = c(0.25, 0.99),
  level_increment = 0.01,
  adjust = c("none", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr"),
  cifun = c("quantile", "hdi"),
  include_intercept = FALSE,
  include_zero = TRUE,
  sig_diffs = NULL,
  ...
)

Value

A list (of class "viztest") with the following elements:

tab: A data frame with results from the grid search. The data frame has four variables: level - the confidence level used in the grid search; psame - the proportion of (non-)overlaps that match the normal theory tests; pdiff - the proportion of pairwise tests that are statistically significant; easy - the ease with which the comparisons are made.
pw_tests: A logical vector indicating which tests are statistically significant.
ci_tests: A logical vector indicating whether the confidence intervals are disjoint (TRUE) or overlap (FALSE).
combs: The pairwise combinations of stimuli used in the test. Note, the stimuli are reordered from largest to smallest, so the numbers do not represent the position in the original ordering.
param_names: A vector of the names of the parameters reordered by size - largest to smallest.
L: The lower confidence bounds from the grid search.
U: The upper confidence bounds from the grid search.
est: A data frame with the variables vbl - the parameter name; est - the parameter estimate; se - the parameter standard error.
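
As a rough illustration of how the elements above relate, the sketch below recomputes psame and pdiff from toy pw_tests and ci_tests vectors. The values are made up for illustration, and the formulas are an assumption based on the descriptions above (for a single candidate level), not code taken from the package.

```r
# Toy logical vectors for four pairwise comparisons (illustrative values only).
pw_tests <- c(TRUE, FALSE, TRUE, FALSE)   # TRUE = pairwise test is significant
ci_tests <- c(TRUE, FALSE, FALSE, FALSE)  # TRUE = intervals are disjoint

# Assumed definition: share of comparisons where the interval (non-)overlap
# agrees with the pairwise test.
psame <- mean(pw_tests == ci_tests)

# Assumed definition: share of pairwise tests that are significant.
pdiff <- mean(pw_tests)

psame
pdiff
```

Here the third comparison is significant but its intervals overlap, so three of the four comparisons agree.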

Arguments

obj: A model object (or any object) where coef() and vcov() return estimates of coefficients and sampling variability.
test_level: The type I error rate of the pairwise tests.
range_levels: The range of confidence levels to try.
level_increment: Step size between successive confidence levels tried within range_levels.
adjust: Multiplicity adjustment to use when calculating the p-values for normal theory pairwise tests.
cifun: For simulation results, the method used to calculate the confidence/credible interval: either "quantile" (default) or "hdi" for highest density region.
include_intercept: Logical indicating whether the intercept should be included in the tests, defaults to FALSE.
include_zero: Logical indicating whether univariate tests at zero should be included, defaults to TRUE.
sig_diffs: An optional vector of values identifying whether each pair of values is statistically different (1) or not (0). See Details for more information on specifying this value; there is some added complexity here.
...: Other arguments, currently not implemented.
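
To make range_levels, level_increment, adjust and test_level concrete, here is a minimal base R sketch. It assumes the grid is built with seq() and that the adjust options correspond to stats::p.adjust(), whose method names are the same; neither is confirmed by this page. The p-values are hypothetical.

```r
# Defaults from the Usage section.
range_levels <- c(0.25, 0.99)
level_increment <- 0.01
test_level <- 0.05

# Candidate confidence levels for the grid search (assumed construction).
levels <- seq(range_levels[1], range_levels[2], by = level_increment)
length(levels)

# Hypothetical unadjusted p-values for three pairwise tests.
p_raw <- c(0.004, 0.030, 0.200)

# Holm adjustment via stats::p.adjust(); the other adjust options are
# also valid p.adjust() method names.
p_holm <- p.adjust(p_raw, method = "holm")

# A pair counts as different when its adjusted p-value is below test_level.
p_raw < test_level
p_holm < test_level
```

With these toy values the second test is significant without adjustment but not after the Holm adjustment, which changes the pattern the confidence intervals have to reproduce.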

Details

The algorithm first calculates results of a set of pairwise tests. For objects with estimates and a variance-covariance matrix, normal theory tests are calculated. Optionally, these tests can be subjected to a multiplicity adjustment. In the case of simulation results, something akin to p-values is calculated by identifying the probability that one estimate is larger than another. To mimic the way we use p-values in the frequentist case, we subtract the probability of difference from 1, such that smaller values indicate more confidence in the difference.

The algorithm then performs a grid search over range_levels at increments of level_increment. For each candidate level, the confidence intervals for all parameters are calculated. For each pair of estimates, it identifies whether the confidence intervals (or credible intervals if the input is a matrix of Bayesian simulation draws) overlap. For each candidate level, it calculates the proportion of pairs for which either the difference is significant/credible and the confidence/credible intervals do not overlap, or the difference is not significant/credible and the intervals do overlap. The main idea is to find the level(s) such that the (non-)overlaps perfectly correspond with whether the differences are significant.

If such a level can be found, a visual inspection of confidence or credible intervals at that level will identify whether a pair of estimates is statistically different or not.
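
The procedure described above can be sketched in base R for the normal theory case. This is an illustrative reimplementation under stated assumptions, not the package's code: it uses normal quantiles throughout, applies no multiplicity adjustment, drops the intercept, and leaves out the univariate tests at zero that include_zero adds.

```r
data(mtcars)
mtcars$cyl <- as.factor(mtcars$cyl)
mod <- lm(qsec ~ hp + wt + cyl, data = mtcars)

# Estimates and covariance without the intercept, sorted largest to smallest.
b <- coef(mod)[-1]
V <- vcov(mod)[-1, -1]
ord <- order(b, decreasing = TRUE)
b <- b[ord]
V <- V[ord, ord]

# All pairs (i, j) with i ahead of j in the sorted order.
pairs <- t(combn(length(b), 2))
i <- pairs[, 1]
j <- pairs[, 2]

# Normal theory test of b[i] - b[j] = 0, accounting for the covariance.
diff_se <- sqrt(diag(V)[i] + diag(V)[j] - 2 * V[cbind(i, j)])
p <- 2 * pnorm(abs(b[i] - b[j]) / diff_se, lower.tail = FALSE)
pw_sig <- unname(p < 0.05)

# Grid search: at each level, compare interval (non-)overlap with the tests.
levels <- seq(0.25, 0.99, by = 0.01)
se <- sqrt(diag(V))
psame <- sapply(levels, function(lev) {
  z <- qnorm(1 - (1 - lev) / 2)
  lower <- b - z * se
  upper <- b + z * se
  # b is sorted, so a pair is disjoint when the larger estimate's lower
  # bound lies above the smaller estimate's upper bound.
  disjoint <- unname(lower[i] > upper[j])
  mean(disjoint == pw_sig)
})

tab <- data.frame(level = levels, psame = psame)
tab[tab$psame == max(tab$psame), ]
```

The last line lists the candidate levels with the highest agreement; a value of 1 for psame would mean that interval overlap at that level reproduces every pairwise test in this sketch.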

While most of the parameters are straightforward, the sig_diffs argument must be specified such that the stimuli are in order from highest to lowest. This is most easily done by using make_diff_template() to identify the appropriate order of the comparisons.
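
The ordering requirement can be illustrated in base R. The estimates, the column names (larger, smaller, sig) and the pair ordering produced by combn() are assumptions made for this sketch; the layout returned by make_diff_template() is not documented on this page and should be treated as authoritative.

```r
# Hypothetical estimates, named for illustration only.
est <- c(a = 0.2, b = 1.5, c = -0.4, d = 0.9)

# Sort from largest to smallest, as required for sig_diffs.
est_sorted <- sort(est, decreasing = TRUE)
names(est_sorted)

# Enumerate the pairs in the sorted order (assumed layout).
pairs <- t(combn(names(est_sorted), 2))
template <- data.frame(larger = pairs[, 1], smaller = pairs[, 2], sig = 0)

# Mark the pairs judged different by some external test (hypothetical choice).
template$sig[template$larger == "b" & template$smaller == "c"] <- 1

sig_diffs <- template$sig
template
```

Building the vector from a table of named pairs, rather than typing the 0/1 values by hand, makes it easier to check that each entry refers to the intended comparison.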

References

David A. Armstrong II and William Poirier. "Decoupling Visualization and Testing when Presenting Confidence Intervals" Political Analysis doi:10.1017/pan.2024.24.

Examples

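data(mtcars)
# Treat cylinders as categorical and standardize the continuous predictors
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$hp <- scale(mtcars$hp)
mtcars$wt <- scale(mtcars$wt)
mod <- lm(qsec ~ hp + wt + cyl, data = mtcars)
# Grid search with the default settings
viztest(mod)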