ggdist (version 2.4.0)

stat_dist_slabinterval: Distribution + interval plots (eye plots, half-eye plots, CCDF barplots, etc) for analytical distributions (ggplot stat)

Description

Stats for computing distribution functions (densities or CDFs) + intervals for use with geom_slabinterval(). Uses the dist aesthetic to specify a distribution using objects from the distributional package, or using distribution names and arg1, ... arg9 aesthetics (or args as a list column) to specify distribution arguments. See Details.

Usage

stat_dist_slabinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  slab_type = c("pdf", "cdf", "ccdf"),
  p_limits = c(NA, NA),
  orientation = NA,
  limits = NULL,
  n = 501,
  .width = c(0.66, 0.95),
  show_slab = TRUE,
  show_interval = TRUE,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_dist_halfeye(...)

stat_dist_eye(..., side = "both")

stat_dist_ccdfinterval(
  ...,
  slab_type = "ccdf",
  justification = 0.5,
  side = "topleft",
  normalize = "none"
)

stat_dist_cdfinterval(
  ...,
  slab_type = "cdf",
  justification = 0.5,
  side = "topleft",
  normalize = "none"
)

stat_dist_gradientinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  justification = 0.5,
  thickness = 1,
  show.legend = c(size = FALSE, slab_alpha = FALSE),
  inherit.aes = TRUE
)

stat_dist_pointinterval(..., show_slab = FALSE)

stat_dist_interval(
  mapping = NULL,
  data = NULL,
  geom = "interval",
  position = "identity",
  ...,
  show_slab = FALSE,
  show_point = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_dist_slab(
  mapping = NULL,
  data = NULL,
  geom = "slab",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

Used to override the default connection between stat_slabinterval() and geom_slabinterval().

position

Position adjustment, either as a string, or the result of a call to a position adjustment function.

...

Other arguments passed to layer(). They may also be arguments to the paired geom (e.g., geom_pointinterval()).

slab_type

The type of slab function to calculate: probability density (or mass) function ("pdf"), cumulative distribution function ("cdf"), or complementary CDF ("ccdf").
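As a rough illustration of the three slab types, the base R sketch below evaluates each function for a standard Normal distribution on a grid of 501 points (the default value of n). It uses only dnorm() and pnorm(); it is not ggdist's internal code, and the grid limits are an assumption chosen for illustration.

```r
# Illustrative sketch only: the three slab functions for a standard Normal,
# evaluated on an evenly spaced grid. Not ggdist's internal implementation.
x <- seq(qnorm(0.001), qnorm(0.999), length.out = 501)

slab_pdf  <- dnorm(x)        # slab_type = "pdf"
slab_cdf  <- pnorm(x)        # slab_type = "cdf"
slab_ccdf <- 1 - pnorm(x)    # slab_type = "ccdf": complementary CDF

# The CDF and CCDF sum to 1 at every grid point
all.equal(slab_cdf + slab_ccdf, rep(1, length(x)))
```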

p_limits

Probability limits (as a vector of size 2) used to determine the lower and upper limits of the slab. E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA) on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
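The rule for NA limits can be sketched in base R as follows. The helper effective_p_limits() is hypothetical (it is not part of ggdist) and takes the distribution's support as an explicit argument, which is an assumption made to keep the example self-contained.

```r
# Hypothetical helper (not part of ggdist): resolve NA entries in p_limits
# following the rule described above. `support` is c(min, max) of the
# distribution's support.
effective_p_limits <- function(p_limits, support) {
  lower <- p_limits[1]
  upper <- p_limits[2]
  if (is.na(lower)) lower <- if (is.finite(support[1])) 0 else 0.001
  if (is.na(upper)) upper <- if (is.finite(support[2])) 1 else 0.999
  c(lower, upper)
}

# Gamma: lower bound of the support is finite, upper bound is not
gamma_p <- effective_p_limits(c(NA, NA), support = c(0, Inf))

# Normal: neither bound of the support is finite
normal_p <- effective_p_limits(c(NA, NA), support = c(-Inf, Inf))

# Probability limits translate to slab limits via the quantile function
qnorm(normal_p)
```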

orientation

Whether this geom is drawn horizontally ("horizontal") or vertically ("vertical"). The default, NA, automatically detects the orientation based on how the aesthetics are assigned, and should generally do an okay job at this. When horizontal (resp. vertical), the geom uses the y (resp. x) aesthetic to identify different groups, then for each group uses the x (resp. y) aesthetic and the thickness aesthetic to draw a function as a slab, and draws points and intervals horizontally (resp. vertically) using the xmin, x, and xmax (resp. ymin, y, and ymax) aesthetics. For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (tidybayes had an orientation parameter before ggplot did, and I think the tidybayes naming scheme is more intuitive: "x" and "y" are not orientations and their mapping to orientations is, in my opinion, backwards; but the base ggplot naming scheme is allowed for compatibility).

limits

Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

Number of points at which to evaluate slab_function.

.width

The .width argument passed to interval_function or point_interval.
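One way to read the default .width = c(0.66, 0.95) is as a pair of equal-tailed quantile intervals around the median. The sketch below assumes that interpretation; quantile_intervals() is a hypothetical helper written for this example and is not a ggdist function.

```r
# Hypothetical helper (not part of ggdist): equal-tailed quantile intervals
# for a distribution given by its quantile function `qfun`.
quantile_intervals <- function(qfun, .width = c(0.66, 0.95), ...) {
  data.frame(
    .width = .width,
    lower  = qfun((1 - .width) / 2, ...),
    point  = qfun(0.5, ...),             # median as the point summary
    upper  = qfun((1 + .width) / 2, ...)
  )
}

# One row per interval width for a Normal(5, 1) distribution
intervals <- quantile_intervals(qnorm, mean = 5, sd = 1)
intervals
```

Each row corresponds to one interval; wider values of .width give intervals that contain the narrower ones.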

show_slab

Should the slab portion of the geom be drawn? Default TRUE.

show_interval

Should the interval portion of the geom be drawn? Default TRUE.

na.rm

If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

Should this layer be included in the legends? Default is c(size = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms).

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

side

Which side to draw the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).

justification

Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.

normalize

How to normalize heights of functions input to the thickness aesthetic. If "all" (the default), normalize so that the maximum height across all data is 1; if "panels", normalize within panels so that the maximum height in each panel is 1; if "xy", normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1; if "groups", normalize within values of the opposite axis and within groups so that the maximum height in each group is 1; if "none", values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

thickness

Override for the thickness aesthetic in geom_slabinterval(): the thickness of the slab at each x / y value of the slab (depending on orientation).

show_point

Should the point portion of the geom be drawn? Default TRUE.

Value

A ggplot2::Stat representing a slab or combined slab+interval geometry which can be added to a ggplot() object.

Aesthetics

These stats support the following aesthetics:

  • dist

  • args

  • arg1

  • arg2

  • arg3

  • arg4

  • arg5

  • arg6

  • arg7

  • arg8

  • arg9

  • x

  • y

  • datatype

  • thickness

  • size

  • group

In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:

  • x

  • y

  • datatype

  • alpha

  • colour

  • colour_ramp

  • linetype

  • fill

  • shape

  • stroke

  • point_colour

  • point_fill

  • point_alpha

  • point_size

  • size

  • interval_colour

  • interval_alpha

  • interval_size

  • interval_linetype

  • slab_size

  • slab_colour

  • slab_fill

  • slab_alpha

  • slab_linetype

  • fill_ramp

  • ymin

  • ymax

  • xmin

  • xmax

  • width

  • height

  • thickness

  • group

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

Computed Variables

  • x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.

  • xmin or ymin: For intervals, the lower end of the interval from the interval function.

  • xmax or ymax: For intervals, the upper end of the interval from the interval function.

  • f: For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type.

  • pdf: For slabs, the probability density function.

  • cdf: For slabs, the cumulative distribution function.

Details

A highly configurable stat for generating a variety of plots that combine a "slab" that describes a distribution plus an interval. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a combination of a violin plot and interval), half-eye plots (a density plus interval), and CCDF bar plots (a complementary CDF plus interval).

The shortcut stat names follow the pattern stat_dist_[name].

Stats include:

  • stat_dist_eye: Eye plots (violin + interval)

  • stat_dist_halfeye: Half-eye plots (density + interval)

  • stat_dist_ccdfinterval: CCDF bar plots (CCDF + interval)

  • stat_dist_cdfinterval: CDF bar plots (CDF + interval)

  • stat_dist_gradientinterval: Density gradient + interval plots

  • stat_dist_pointinterval: Point + interval plots

  • stat_dist_interval: Interval plots

These stats expect a dist aesthetic to specify a distribution. This aesthetic can be used in one of two ways:

  • dist can be any distribution object from the distributional package, such as dist_normal(), dist_beta(), etc. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.

  • dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.

    See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
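The naming convention for character distribution names can be illustrated in base R. The helper dist_functions() below is hypothetical (not part of ggdist); it only shows how a name such as "norm" maps onto dnorm(), pnorm(), and qnorm(), with positional arguments playing the role of arg1, arg2, and so on.

```r
# Hypothetical helper (not part of ggdist): build the density, CDF, and
# quantile functions for a distribution name by prefixing "d", "p", and "q".
dist_functions <- function(dist, args = list()) {
  list(
    pdf      = function(x) do.call(paste0("d", dist), c(list(x), args)),
    cdf      = function(q) do.call(paste0("p", dist), c(list(q), args)),
    quantile = function(p) do.call(paste0("q", dist), c(list(p), args))
  )
}

# "norm" with arg1 = 5 (mean) and arg2 = 1 (sd)
f <- dist_functions("norm", args = list(5, 1))
f$quantile(0.5)  # median
f$cdf(5)         # probability below the mean
f$pdf(5)         # density at the mean
```

A name such as "lnorm" or "gamma" works the same way, since base R provides the corresponding d/p/q triples.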

See Also

See geom_slabinterval() for more information on the geom these stats use by default and some of the options they have. See stat_sample_slabinterval() for the versions of these stats that can be used on samples. See vignette("slabinterval") for a variety of examples of use.

Examples

# NOT RUN {
library(dplyr)
library(ggplot2)
library(distributional)
library(ggdist)  # provides theme_ggdist() and the stat_dist_... family

theme_set(theme_ggdist())

dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  "a",          "h",     5,   1,
  "b",          "h",     7,   1.5,
  "c",          "h",     8,   1,
  "c",          "i",     9,   1,
  "c",          "j",     7,   1
)

dist_df %>%
  ggplot(aes(x = group, dist = "norm", arg1 = mean, arg2 = sd, fill = subgroup)) +
  stat_dist_eye(position = "dodge")

# Using functions from the distributional package (like dist_normal()) with the
# dist aesthetic can lead to more compact/expressive specifications

dist_df %>%
  ggplot(aes(x = group, dist = dist_normal(mean, sd), fill = subgroup)) +
  stat_dist_eye(position = "dodge")

# the stat_dist_... family applies a Jacobian adjustment to densities
# when plotting on transformed scales in order to plot them correctly.
# For example, here is a log-Normal distribution plotted on the log
# scale, where it will appear Normal:
data.frame(dist = "lnorm") %>%
  ggplot(aes(y = 1, dist = dist, arg1 = log(10), arg2 = 2*log(10))) +
  stat_dist_halfeye() +
  scale_x_log10(breaks = 10^seq(-5,7, by = 2))

# see vignette("slabinterval") for many more examples.

# }
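The Jacobian adjustment mentioned in the log-Normal example can be checked by hand with the change-of-variables formula. The base R sketch below is an independent illustration, not ggdist's implementation: it transforms the log-Normal density to the log10 scale and compares the result with the Normal density it should match.

```r
# X ~ log-Normal(meanlog, sdlog), matching arg1 and arg2 in the example above.
meanlog <- log(10)
sdlog   <- 2 * log(10)

y <- seq(-5, 7, by = 0.5)   # positions on the log10 scale
x <- 10^y                   # corresponding values on the original scale

# Change of variables for Y = log10(X):
#   f_Y(y) = f_X(x) * |dx/dy|, where dx/dy = x * log(10)
dens_y <- dlnorm(x, meanlog, sdlog) * x * log(10)

# On the log10 scale, Y is Normal with mean meanlog / log(10) = 1
# and sd sdlog / log(10) = 2
all.equal(dens_y, dnorm(y, mean = 1, sd = 2))
```

Without the factor x * log(10), the curve drawn on the log scale would be the untransformed density, which does not look Normal.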
