Stats for computing densities and CDFs + intervals from samples for use with geom_slabinterval(). Useful for creating eye plots, half-eye plots, CCDF bar plots, etc.
stat_sample_slabinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  slab_type = c("pdf", "cdf", "ccdf", "histogram"),
  adjust = 1,
  trim = TRUE,
  breaks = "Sturges",
  outline_bars = FALSE,
  orientation = NA,
  limits = NULL,
  n = 501,
  interval_function = NULL,
  interval_args = list(),
  point_interval = median_qi,
  .width = c(0.66, 0.95),
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_halfeye(...)

stat_eye(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_ccdfinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  slab_type = "ccdf",
  normalize = "none",
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_cdfinterval(..., slab_type = "cdf", normalize = "none")

stat_gradientinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  show.legend = c(size = FALSE, slab_alpha = FALSE),
  inherit.aes = TRUE
)

stat_histinterval(..., slab_type = "histogram")

stat_slab(
  mapping = NULL,
  data = NULL,
  geom = "slab",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)
data: The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
geom: Use to override the default connection between stat_slabinterval and geom_slabinterval().
position: Position adjustment, either as a string, or the result of a call to a position adjustment function.
...: Other arguments passed to layer(). They may also be arguments to the paired geom (e.g., geom_pointinterval()).
slab_type: The type of slab function to calculate: probability density (or mass) function ("pdf"), cumulative distribution function ("cdf"), complementary CDF ("ccdf"), or histogram ("histogram").
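As a rough illustration of the quantities behind the "pdf", "cdf", and "ccdf" slab types, the sketch below computes them from a sample using base R only. It assumes a kernel density estimate from density() for the PDF and the empirical CDF from ecdf(); the estimators used inside the stat may differ in detail, and the variable names are illustrative.

```r
set.seed(1234)
x <- rnorm(500, mean = 5)

# Grid of input values at which the slab functions are evaluated
grid <- seq(min(x), max(x), length.out = 101)

# "pdf": kernel density estimate evaluated over the range of the data
dens <- density(x, from = min(x), to = max(x), n = 101)
pdf_values <- dens$y

# "cdf": empirical cumulative distribution function evaluated on the grid
cdf_values <- ecdf(x)(grid)

# "ccdf": complementary CDF, i.e. 1 - CDF
ccdf_values <- 1 - cdf_values
```

The CDF and CCDF always lie in [0, 1], which is why the CDF and CCDF shortcut stats can use normalize = "none".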
adjust: If slab_type is "pdf", bandwidth for the density estimator is adjusted by multiplying it by this value. See density() for more information.
trim: If slab_type is "pdf", should the density estimate be trimmed to the range of the input data? Default TRUE.
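The effect of these two arguments can be previewed with base R's density(). This is a sketch under an assumption: it treats trimming as equivalent to density()'s cut = 0 (evaluate only within the range of the data) and no trimming as the density() default of cut = 3; the stat's internal call is not shown in this page and may differ.

```r
set.seed(1234)
x <- rnorm(500)

# adjust multiplies the bandwidth: larger values give a smoother estimate
base_fit <- density(x)              # adjust = 1
wide_fit <- density(x, adjust = 2)  # bandwidth doubled

# Assumed analogue of trim = TRUE: stay within the range of the data
trimmed_fit <- density(x, cut = 0)

# Assumed analogue of trim = FALSE: extend 3 bandwidths past the data
untrimmed_fit <- density(x, cut = 3)

# Compare the evaluation ranges of the two estimates
range(trimmed_fit$x)
range(untrimmed_fit$x)
```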
breaks: If slab_type is "histogram", the breaks parameter that is passed to hist() to determine where to put breaks in the histogram.
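Because breaks is handed to hist(), the forms hist() accepts can be tried out directly in base R before using them in a plot. The sketch below shows three of them on an illustrative sample; the variable names are not part of the package.

```r
set.seed(1234)
x <- rnorm(500)

# A named rule (the default shown in the usage above)
h_sturges <- hist(x, breaks = "Sturges", plot = FALSE)

# A single number, which hist() treats as a suggested number of bins
h_bins <- hist(x, breaks = 30, plot = FALSE)

# An explicit vector of break points that covers the data
manual_breaks <- seq(floor(min(x)), ceiling(max(x)), by = 0.5)
h_manual <- hist(x, breaks = manual_breaks, plot = FALSE)

# Number of bins produced by each choice
c(length(h_sturges$counts), length(h_bins$counts), length(h_manual$counts))
```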
outline_bars: If slab_type is "histogram", outline_bars determines if outlines in between the bars are drawn when the slab_color aesthetic is used. If FALSE (the default), the outline is drawn only along the tops of the bars; if TRUE, outlines in between bars are also drawn.
orientation: Whether this geom is drawn horizontally ("horizontal") or vertically ("vertical"). The default, NA, automatically detects the orientation based on how the aesthetics are assigned, and should generally do an okay job at this. When horizontal (resp. vertical), the geom uses the y (resp. x) aesthetic to identify different groups, then for each group uses the x (resp. y) aesthetic and the thickness aesthetic to draw a function as a slab, and draws points and intervals horizontally (resp. vertically) using the xmin, x, and xmax (resp. ymin, y, and ymax) aesthetics. For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (tidybayes had an orientation parameter before ggplot did, and I think the tidybayes naming scheme is more intuitive: "x" and "y" are not orientations and their mapping to orientations is, in my opinion, backwards; but the base ggplot naming scheme is allowed for compatibility).
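A minimal sketch of the two orientations, assuming ggplot2 and ggdist are installed. The data frame df and the plot object names are made up for illustration; the orientation is set explicitly here, although the default NA would normally detect it from the aesthetic mapping.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 5), rnorm(200, mean = 7))
)

# Horizontal: groups on y, sample values on x
p_horizontal <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(orientation = "horizontal")

# Vertical: groups on x, sample values on y
p_vertical <- ggplot(df, aes(x = group, y = value)) +
  stat_halfeye(orientation = "vertical")
```

Printing either object (for example, print(p_horizontal)) draws the plot.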
limits: Limits for slab_function, as a vector of length two. These limits are combined with those computed by the limits_function as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0.
n: Number of points at which to evaluate slab_function.
interval_function: Custom function for generating intervals (for most common use cases the point_interval argument will be easier to use). This function takes a data frame of aesthetics and a .width parameter (a vector of interval widths), and returns a data frame with columns .width (from the .width vector), .value (point summary) and .lower and .upper (endpoints of the intervals, given the .width). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. If interval_function is NULL, point_interval is used instead.
interval_args: Additional arguments passed to interval_function or point_interval.
point_interval: A function from the point_interval() family (e.g., median_qi, mean_qi, etc). This function should take in a vector of values, and should obey the .width and .simple_names parameters of point_interval() functions, such that when given a vector with .simple_names = TRUE it returns a data frame with variables .value, .lower, .upper, and .width. Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
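The contract described above can be sketched in base R. The function below, mean_quantile_interval, is a hypothetical helper and not part of the package: it takes a vector of values and a vector of widths and returns one row per width with the columns .value, .lower, .upper, and .width. It always uses the simple column names, so .simple_names is accepted but ignored. Whether the stat needs further columns or arguments is not stated in this page, so treat this as a sketch of the documented shape rather than a guaranteed drop-in.

```r
# Hypothetical helper: mean point summary with equal-tailed quantile intervals
mean_quantile_interval <- function(x, .width = 0.95, .simple_names = TRUE, ...) {
  rows <- lapply(.width, function(w) {
    # Equal-tailed interval: cut (1 - w) / 2 of the mass from each tail
    probs <- c((1 - w) / 2, (1 + w) / 2)
    q <- unname(quantile(x, probs))
    data.frame(.value = mean(x), .lower = q[1], .upper = q[2], .width = w)
  })
  # One row per requested interval width
  do.call(rbind, rows)
}

res <- mean_quantile_interval(1:101, .width = c(0.66, 0.95))
res
```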
.width: The .width argument passed to interval_function or point_interval.
na.rm: If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.
show.legend: Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms).
inherit.aes: If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().
normalize: How to normalize heights of functions input to the thickness aesthetic. If "all" (the default), normalize so that the maximum height across all data is 1; if "panels", normalize within panels so that the maximum height in each panel is 1; if "xy", normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1; if "groups", normalize within values of the opposite axis and within groups so that the maximum height in each group is 1; if "none", values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).
A ggplot2::Stat representing a slab or combined slab+interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the stat() or after_stat() functions:
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1].
level: For intervals, the interval width as an ordered factor.
f: For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type.
pdf: For slabs, the probability density function.
cdf: For slabs, the cumulative distribution function.
n: For slabs, the number of data points summarized into that slab.
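A short sketch of using a computed variable in an aesthetic mapping, assuming ggplot2 (3.3.0 or later, for after_stat()) and ggdist are installed. The data frame df and the threshold of 6 are illustrative. In a horizontal layout the computed x holds the input values of the slab function, so a logical expression on x splits the slab fill at that value; the exact rendering of a fill that varies within a slab may depend on the package version.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 5), rnorm(200, mean = 7))
)

# Fill each slab differently above and below an illustrative threshold,
# using the computed variable x via after_stat()
p_threshold <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(aes(fill = after_stat(x > 6)))
```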
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal") except for stat_dist_ geometries (which use only one of x or y at a time along with the dist aesthetic).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical") except for stat_dist_ geometries (which use only one of x or y at a time along with the dist aesthetic).
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
size: Width of the outline around the slab (if visible). Also determines the width of the line used to draw the interval and the size of the point, but raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see geom_slabinterval()). Use the slab_size, interval_size, or point_size aesthetics (below) to set sub-geometry line widths separately (note that when size is set directly using the override aesthetics, interval and point sizes are not affected by interval_size_domain, interval_size_range, and fatten_point).
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color/line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_size: Override for size: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color/line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_size: Override for size: the line width of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color/line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation.
Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
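The override aesthetics listed above can be set as constants on a layer to style each sub-geometry independently. The sketch below assumes ggplot2 and ggdist are installed; the data frame df and the chosen colors are illustrative only.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 5), rnorm(200, mean = 7))
)

p_styled <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(
    slab_fill = "skyblue",       # override for fill: affects the slab only
    interval_colour = "gray30",  # override for colour: affects the interval only
    point_colour = "firebrick"   # override for colour: affects the point only
  )
```

Setting fill or colour instead would change every sub-geometry that uses that aesthetic at once.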
A highly configurable stat for generating a variety of plots that combine a "slab" that summarizes a sample plus an interval. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a combination of a violin plot and interval), half-eye plots (a density plus interval), and CCDF bar plots (a complementary CDF plus interval). These can be handy for visualizing posterior distributions in Bayesian inference, amongst other things.
The shortcut stat names follow the pattern stat_[name].
Stats include:
stat_eye: Eye plots (violin + interval)
stat_halfeye: Half-eye plots (density + interval)
stat_ccdfinterval: CCDF bar plots (CCDF + interval)
stat_cdfinterval: CDF bar plots (CDF + interval)
stat_gradientinterval: Density gradient + interval plots
stat_histinterval: Histogram + interval plots
stat_pointinterval: Point + interval plots
stat_interval: Interval plots
See geom_slabinterval() for more information on the geom these stats use by default and some of the options they have. See stat_dist_slabinterval() for the versions of these stats that can be used on analytical distributions.
See vignette("slabinterval") for a variety of examples of use.
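A brief sketch of a few of the shortcut stats on the same sample, assuming ggplot2 and ggdist are installed. The data frame df, the number of breaks, and the interval widths are illustrative choices, not recommended defaults.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  value = c(rnorm(200, mean = 5), rnorm(200, mean = 7))
)

# CCDF bar plot (CCDF + interval), drawn vertically
p_ccdf <- ggplot(df, aes(x = group, y = value)) +
  stat_ccdfinterval()

# Histogram + interval, with a suggested number of bins
p_hist <- ggplot(df, aes(x = value, y = group)) +
  stat_histinterval(breaks = 20)

# Half-eye with a different point summary and interval widths
p_mean <- ggplot(df, aes(x = value, y = group)) +
  stat_halfeye(point_interval = mean_qi, .width = c(0.5, 0.9))
```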
library(dplyr)
library(ggplot2)
library(ggdist)

# consider the following example data:
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c", "c", "c"),
  value = rnorm(2500, mean = c(5, 7, 9, 9, 9), sd = c(1, 1.5, 1, 1, 1))
)

# here are vertical eyes:
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye()

# note the sample size is not automatically incorporated into the
# area of the densities in case one wishes to plot densities against
# a reference (e.g. a prior generated by a stat_dist_... function).
# But you may wish to account for sample size if using these geoms
# for something other than visualizing posteriors; in which case
# you can use stat(pdf*n):
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye(aes(thickness = stat(pdf*n)))

# see vignette("slabinterval") for many more examples.