add_p: Adds p-values to summary tables

Description

Adds p-values to tables created by tbl_summary by comparing values across groups.

Usage

add_p(x, test = NULL, pvalue_fun = NULL, group = NULL,
  include = NULL, exclude = NULL)

Arguments

x

Object with class tbl_summary from the tbl_summary function

test

List of formulas specifying statistical tests to perform, e.g. list(all_continuous() ~ "t.test", all_categorical() ~ "fisher.test"). Options include

"t.test" for a t-test,
"wilcox.test" for a Wilcoxon rank-sum test,
"kruskal.test" for a Kruskal-Wallis rank-sum test,
"chisq.test" for a Chi-squared test of independence,
"fisher.test" for a Fisher's exact test,
"lme4" for a random intercept logistic regression model to account for clustered data, lme4::glmer(by ~ variable + (1 | group), family = binomial). The by argument must be binary for this option.

Tests default to "kruskal.test" for continuous variables, "chisq.test" for categorical variables with all expected cell counts >=5, and "fisher.test" for categorical variables with any expected cell count <5. A custom test function can be supplied for all or some variables; the Examples section shows a custom McNemar test.
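As an illustration of the formula-list syntax described above, the following sketch overrides the default tests for every continuous and every categorical variable. It assumes the gtsummary package and its bundled trial data set are available; the object name add_p_tests is only a label chosen for this example.

```r
library(gtsummary)

# Use a t-test for all continuous variables and Fisher's exact test
# for all categorical variables, instead of the defaults.
add_p_tests <-
  trial[c("age", "grade", "response", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(
    test = list(
      all_continuous() ~ "t.test",
      all_categorical() ~ "fisher.test"
    )
  )
```

Variables not matched by any formula in the list keep the default tests.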

pvalue_fun

Function to round and format p-values. Default is style_pvalue. The function must accept a numeric vector (the exact, unrounded p-values) and return a character vector of rounded/formatted p-values, e.g. pvalue_fun = function(x) style_pvalue(x, digits = 2) or, equivalently, purrr::partial(style_pvalue, digits = 2).
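To make the numeric-in, string-out contract concrete, here is a minimal base R sketch of a formatter. The name fmt_p3 and its rounding rules are hypothetical choices for illustration, not part of gtsummary.

```r
# Hypothetical formatter: three decimal places, with very small
# p-values shown as "<0.001". Takes a numeric vector, returns strings.
fmt_p3 <- function(x) {
  ifelse(x < 0.001, "<0.001", formatC(x, digits = 3, format = "f"))
}

fmt_p3(c(0.5, 0.0234, 0.00004))
```

A function like this could then be passed as add_p(pvalue_fun = fmt_p3).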

group

Column name of an ID or grouping variable. The column can be used to calculate p-values with correlated data (e.g. when the test argument is "lme4"). Default is NULL. If specified, the row associated with this variable is omitted from the summary table.

include

Names of variables to include in output.

exclude

Names of variables to exclude from output.

Value

A tbl_summary object

Setting Defaults

If you would like to consistently use a different function to format p-values or estimates, you can set options in the script or in the user- or project-level startup file, '.Rprofile'. The default confidence level can also be set. Please note that the default option for the estimate is the same as it is for tbl_regression().

options(gtsummary.pvalue_fun = new_function)
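As a sketch of what such a setting might look like in practice, the lines below register a two-digit formatter as the session-wide default. The name two_digit_p is hypothetical; the formatter itself reuses the style_pvalue call shown under pvalue_fun, and gtsummary only needs to be installed by the time the function is actually called.

```r
# Could be placed in '.Rprofile' or at the top of a script.
# `two_digit_p` is a hypothetical name chosen for this example.
two_digit_p <- function(x) gtsummary::style_pvalue(x, digits = 2)
options(gtsummary.pvalue_fun = two_digit_p)

# Confirm which formatter is currently registered
getOption("gtsummary.pvalue_fun")
```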

Examples

# NOT RUN {
add_p_ex1 <-
  trial %>%
  dplyr::select(age, grade, response, trt) %>%
  tbl_summary(by = trt) %>%
  add_p()

# }
# NOT RUN {
# Conduct a custom McNemar test for response.
# The function must return a named list, e.g. list(p = 0.05, test = "McNemar's test").
# The function name begins with 'add_p_test.' and ends with the alias
# that is passed to the test argument (here, "mcnemar").
add_p_test.mcnemar <- function(data, variable, by, ...) {
  result <- list()
  result$p <- stats::mcnemar.test(data[[variable]], data[[by]])$p.value
  result$test <- "McNemar's test"
  result
}

add_p_ex2 <-
  trial[c("response", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = vars(response) ~ "mcnemar")
# }