Bar charts for categorical data with statistical details included in the plot as a subtitle.
ggbarstats(
  data,
  main,
  condition,
  counts = NULL,
  ratio = NULL,
  paired = FALSE,
  results.subtitle = TRUE,
  sample.size.label = TRUE,
  label = "percentage",
  perc.k = 0,
  label.args = list(alpha = 1, fill = "white"),
  bf.message = TRUE,
  sampling.plan = "indepMulti",
  fixed.margin = "rows",
  prior.concentration = 1,
  title = NULL,
  subtitle = NULL,
  caption = NULL,
  conf.level = 0.95,
  nboot = 100,
  legend.title = NULL,
  xlab = NULL,
  ylab = NULL,
  k = 2,
  proportion.test = TRUE,
  ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE,
  package = "RColorBrewer",
  palette = "Dark2",
  ggplot.component = NULL,
  output = "plot",
  messages = TRUE,
  x = NULL,
  y = NULL,
  ...
)
data
A dataframe (or a tibble) from which the specified variables are to be taken. Matrices and tables are not accepted.
main
The variable to use as the rows in the contingency table.
condition
The variable to use as the columns in the contingency table. Default is NULL. If NULL, a one-sample proportion test (a goodness-of-fit test) will be run for the x variable. Otherwise an appropriate association test will be run. This argument cannot be NULL for the ggbarstats function.
counts
A string naming a variable in data containing counts, or NULL if each row represents a single observation (default).
ratio
A vector of proportions: the expected proportions for the proportion test (should sum to 1). Default is NULL, which means the null hypothesis is equal theoretical proportions across the levels of the nominal variable. For example, with two levels this is ratio = c(0.5, 0.5), and with four levels it is ratio = c(0.25, 0.25, 0.25, 0.25), etc.
paired
Logical indicating whether the data came from a within-subjects or repeated measures design study (Default: FALSE). If TRUE, the subtitle reports McNemar's test. If FALSE, it reports Pearson's chi-square test.
results.subtitle
Decides whether the results of statistical tests are to be displayed as a subtitle (Default: TRUE). If set to FALSE, only the plot will be returned.
sample.size.label
Logical that decides whether sample size information should be displayed for each level of the grouping variable y (Default: TRUE).
label
Character that decides what information is displayed on the label in each pie slice (or each bar segment, in the case of ggbarstats). Possible options are "percentage" (default), "counts", and "both".
perc.k
Numeric that decides the number of decimal places for percentage labels (Default: 0).
label.args
Additional aesthetic arguments that will be passed to geom_label.
bf.message
Logical that decides whether to display the Bayes Factor in favor of the null hypothesis. This argument is relevant only for the parametric test (Default: TRUE).
sampling.plan
Character describing the sampling plan. Possible options are "indepMulti" (independent multinomial; default), "poisson", "jointMulti" (joint multinomial), and "hypergeom" (hypergeometric). For more, see ?BayesFactor::contingencyTableBF().
fixed.margin
For the independent multinomial sampling plan, which margin is fixed ("rows" or "cols"). Defaults to "rows".
prior.concentration
Specifies the prior concentration parameter, set to 1 by default. It indexes the expected deviation from the null hypothesis under the alternative, and corresponds to Gunel and Dickey's (1974) "a" parameter.
title
The text for the plot title.
subtitle
The text for the plot subtitle. Will work only if results.subtitle = FALSE.
caption
The text for the plot caption.
conf.level
Scalar between 0 and 1. If unspecified, the default (0.95) returns the lower and upper bounds of 95% confidence intervals.
nboot
Number of bootstrap samples for computing the confidence interval for the effect size (Default: 100).
legend.title
Title text for the legend.
xlab
Custom text for the x axis label (Default: NULL, which will cause the x axis label to be the x variable).
ylab
Custom text for the y axis label (Default: NULL).
k
Number of digits after the decimal point (should be an integer) (Default: k = 2).
proportion.test
Decides whether a proportion test for the main variable is to be carried out for each level of y (Default: TRUE).
ggtheme
A function, a ggplot2 theme name. Default value is ggplot2::theme_bw(). Any of the ggplot2 themes, or themes from extension packages, are allowed (e.g., ggthemes::theme_fivethirtyeight(), hrbrthemes::theme_ipsum_ps(), etc.).
ggstatsplot.layer
Logical that decides whether theme_ggstatsplot theme elements are to be displayed along with the selected ggtheme (Default: TRUE). theme_ggstatsplot is an opinionated theme layer that overrides some aspects of the selected ggtheme.
package
Name of the package from which the palette is desired, as a string or symbol.
palette
Name of the palette, as a string or symbol.
ggplot.component
A ggplot component to be added to the plot prepared by ggstatsplot. This argument is primarily helpful for the grouped_ variant of the current function. Default is NULL. The argument should be entered as a function.
output
Character that describes what is to be returned: can be "plot" (default), "subtitle", or "caption". Setting this to "subtitle" will return the expression containing statistical results. If you have set results.subtitle = FALSE, then this will return NULL. Setting this to "caption" will return the expression containing details about the Bayes Factor analysis, but this is valid only when type = "parametric" and bf.message = TRUE; otherwise this will return NULL. For the functions ggpiestats and ggbarstats, setting output = "proptest" will return a dataframe containing results from proportion tests.
messages
Decides whether messages (references, notes, and warnings) are to be displayed (Default: TRUE).
x
The variable to use as the rows in the contingency table.
y
The variable to use as the columns in the contingency table. Default is NULL. If NULL, a one-sample proportion test (a goodness-of-fit test) will be run for the x variable. Otherwise an appropriate association test will be run. This argument cannot be NULL for the ggbarstats function.
...
Currently ignored.
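As a rough base-R analogue of the counts and paired arguments described above, the following sketch builds a contingency table from a count column and runs the two tests named in the paired description. It is not the ggbarstats implementation: it uses only stats::xtabs, stats::chisq.test, and stats::mcnemar.test, the object names are illustrative, the paired table is made-up data, and correct = FALSE in the McNemar call is an assumption made for this sketch.

```r
# Titanic ships with base R as a 4-way table; as.data.frame() yields one row
# per cell, with the cell count stored in a column named Freq.
titanic_df <- as.data.frame(Titanic)

# Weight each row by its count (the role the counts argument plays) and
# collapse to a Survived x Sex contingency table.
tab <- xtabs(Freq ~ Survived + Sex, data = titanic_df)

# Analogue of paired = FALSE: Pearson's chi-squared test of association.
unpaired <- chisq.test(tab, correct = FALSE)

# Analogue of paired = TRUE: McNemar's test on a HYPOTHETICAL before/after
# table (made-up numbers, for illustration only).
paired_tab <- matrix(
  c(30, 12, 5, 40),
  nrow = 2,
  dimnames = list(before = c("yes", "no"), after = c("yes", "no"))
)
paired <- mcnemar.test(paired_tab, correct = FALSE)

unpaired
paired
```

McNemar's test uses only the two discordant cells of the table, which is why it suits repeated measures on the same units.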
Unlike many statistical software packages, ggstatsplot doesn't provide the option for Yates' correction for Pearson's chi-squared statistic. This is due to a compelling amount of Monte-Carlo simulation research which suggests that Yates' correction is overly conservative, even in small sample sizes. As such, it is recommended that it never be applied in practice (Camilli & Hopkins, 1978, 1979; Feinberg, 1980; Larntz, 1978; Thompson, 1988).
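To see the effect of the correction discussed above, the sketch below compares the uncorrected and Yates-corrected Pearson statistics on a 2 x 2 table using base R's stats::chisq.test. The choice of mtcars variables is arbitrary and only illustrative; this is not code from ggstatsplot.

```r
# A 2 x 2 table from the built-in mtcars data: engine shape by transmission.
tab_2x2 <- table(vs = mtcars$vs, am = mtcars$am)

# Pearson's chi-squared test without the continuity correction.
uncorrected <- chisq.test(tab_2x2, correct = FALSE)

# The same test with Yates' continuity correction (base R's default for 2 x 2).
yates <- chisq.test(tab_2x2, correct = TRUE)

# The corrected statistic is never larger, so its p-value is never smaller.
c(
  uncorrected = unname(uncorrected$statistic),
  yates = unname(yates$statistic)
)
```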
For more about how the effect size measures and their confidence intervals are computed, see ?rcompanion::cohenG, ?rcompanion::cramerV, and ?rcompanion::cramerVFit.
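For orientation, the textbook form of Cramér's V can be computed by hand from the Pearson statistic, as in the base-R sketch below. This is only the plain formula V = sqrt(chi-squared / (n * (min(rows, cols) - 1))); it is an assumption that this matches the point estimate reported by the rcompanion functions, and the bootstrap confidence intervals are not reproduced here.

```r
# Contingency table for the same variables used in the example on this page.
tab_v <- table(vs = mtcars$vs, cyl = mtcars$cyl)

# Uncorrected Pearson chi-squared statistic; the warning about small expected
# counts is suppressed because only the statistic is needed here.
chi <- suppressWarnings(chisq.test(tab_v, correct = FALSE))

n <- sum(tab_v)            # total number of observations
k_min <- min(dim(tab_v))   # smaller of the number of rows and columns

# Textbook Cramer's V; always lies between 0 and 1.
cramers_v <- sqrt(unname(chi$statistic) / (n * (k_min - 1)))
cramers_v
```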
# NOT RUN {
# for reproducibility
set.seed(123)
# association test (or contingency table analysis)
ggstatsplot::ggbarstats(
  data = mtcars,
  x = vs,
  y = cyl,
  nboot = 10,
  legend.title = "Engine"
)
# }
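The quantities that the example above displays can be approximated in base R, which may help when checking the plot by eye. The sketch below tabulates vs within each level of cyl, converts the counts to percentages, and runs a goodness-of-fit test per cyl level as a stand-in for proportion.test = TRUE. Using stats::chisq.test for the per-level test is an assumption of this sketch, not a statement about how ggbarstats computes it.

```r
# Rows are the fill variable (x = vs); columns are the bar positions (y = cyl).
tab_ex <- table(vs = mtcars$vs, cyl = mtcars$cyl)

# Percentage of each vs level within each cyl level (what label = "percentage"
# with perc.k = 0 would round to).
col_pct <- round(100 * prop.table(tab_ex, margin = 2))

# Sample size per level of y (compare sample.size.label = TRUE).
n_per_level <- colSums(tab_ex)

# Goodness-of-fit test against equal proportions within each cyl level.
# Warnings about small expected counts are suppressed for brevity.
prop_p_values <- apply(tab_ex, 2, function(counts) {
  suppressWarnings(chisq.test(counts))$p.value
})

col_pct
n_per_level
prop_p_values
```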