Shortcut version of geom_dotsinterval() for creating dot plots. Geoms based on geom_dotsinterval() create dotplots that automatically ensure the plot fits within the available space.
Roughly equivalent to:
geom_dotsinterval(
  show_point = FALSE, show_interval = FALSE
)
geom_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  binwidth = NA,
  dotsize = 1.07,
  stackratio = 1,
  layout = "bin",
  overlaps = "nudge",
  smooth = "none",
  overflow = "keep",
  verbose = FALSE,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
A ggplot2::Geom representing a dot geometry which can be added to a ggplot() object.
mapping: Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.
data: The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
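As a hedged illustration of the function/formula option, the sketch below restricts a single layer to the first 10 rows of the plot data. The data frame `df` and its `value` column are simulated purely for this example.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated data, assumed here only for illustration
df <- data.frame(value = rnorm(200))

p <- ggplot(df, aes(x = value)) +
  # the formula is converted to a function and called with the plot data (.x);
  # the data frame it returns (10 rows) becomes the data for this layer only
  geom_dots(data = ~ head(.x, 10))
```

Printing `p` draws the layer using only those 10 rows, while other layers added to the same plot would still see the full plot data.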
stat: The statistical transformation to use on the data for this layer, either as a ggproto Stat subclass or as a string naming the stat stripped of the stat_ prefix (e.g. "count" rather than "stat_count").
position: Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.
...: Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat.
binwidth: The bin width to use for laying out the dots. One of:
NA (the default): Dynamically select the bin width based on the size of the plot when drawn. This will pick a binwidth such that the tallest stack of dots is at most scale in height (ideally exactly scale in height, though this is not guaranteed).
A length-1 (scalar) numeric or unit object giving the exact bin width.
A length-2 (vector) numeric or unit object giving the minimum and maximum desired bin width. The bin width will be dynamically selected within these bounds.
If the value is numeric, it is assumed to be in units of data. The bin width (or its bounds) can also be specified using unit(), which may be useful if it is desired that the dots be a certain point size or a certain percentage of the width/height of the viewport. For example, unit(0.1, "npc") would make dots that are exactly 10% of the viewport size along whichever dimension the dotplot is drawn; unit(c(0, 0.1), "npc") would make dots that are at most 10% of the viewport size (while still ensuring the tallest stack is less than or equal to scale).
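The three ways of specifying binwidth described above can be sketched as follows. This is a minimal example on simulated data; `df`, `base`, and the specific value 0.2 are assumptions chosen for illustration, not recommended settings.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated data, assumed here only for illustration
df <- data.frame(value = rnorm(100))

base <- ggplot(df, aes(x = value))

# default (binwidth = NA): bin width is selected when the plot is drawn
p_auto <- base + geom_dots()

# exact bin width; a plain numeric is interpreted in data units
p_exact <- base + geom_dots(binwidth = 0.2)

# length-2 unit: bin width is selected dynamically, but dots are
# at most 10% of the viewport size
p_bounded <- base + geom_dots(binwidth = grid::unit(c(0, 0.1), "npc"))
```

Because the dynamic selection happens at draw time, resizing the plotting device can change the dot size in `p_auto` and `p_bounded`, but not the data-unit bin width used in `p_exact`.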
dotsize: The width of the dots relative to the binwidth. The default, 1.07, makes dots be just a bit wider than the bin width, which is a manually-tuned parameter that tends to work well with the default circular shape, preventing gaps between bins from appearing to be too large visually (as might arise from dots being precisely the binwidth). If it is desired to have dots be precisely the binwidth, set dotsize = 1.
stackratio: The distance between the centers of the dots in the same stack relative to the dot height. The default, 1, makes dots in the same stack just touch each other.
layout: The layout method used for the dots:
"bin" (default): places dots on the off-axis at the midpoint of their bins as in the classic Wilkinson dotplot. This maintains the alignment of rows and columns in the dotplot. This layout is slightly different from the classic Wilkinson algorithm in that: (1) it nudges bins slightly to avoid overlapping bins and (2) if the input data are symmetrical it will return a symmetrical layout.
"weave": uses the same basic binning approach of "bin", but places dots in the off-axis at their actual positions (unless overlaps = "nudge", in which case overlaps may be nudged out of the way). This maintains the alignment of rows but does not align dots within columns.
"hex": uses the same basic binning approach of "bin", but alternates placing dots + binwidth/4 or - binwidth/4 in the off-axis from the bin center. This allows hexagonal packing by setting a stackratio less than 1 (something like 0.9 tends to work).
"swarm": uses the "compactswarm" layout from beeswarm::beeswarm(). Does not maintain alignment of rows or columns, but can be more compact and neat looking, especially for sample data (as opposed to quantile dotplots of theoretical distributions, which may look better with "bin", "weave", or "hex").
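To compare the four layouts side by side, one plot can be built per layout value. The sketch below uses simulated data; `df`, `layouts`, and `plots` are names made up for this example, and the "swarm" layout is assumed to need the beeswarm package to be installed when the plot is drawn.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated data, assumed here only for illustration
df <- data.frame(value = rnorm(150))

layouts <- c("bin", "weave", "hex", "swarm")

# build one plot per layout; "hex" gets stackratio = 0.9 so that
# the alternating offsets pack hexagonally
plots <- lapply(layouts, function(l) {
  ggplot(df, aes(x = value)) +
    geom_dots(layout = l, stackratio = if (l == "hex") 0.9 else 1) +
    ggtitle(paste0("layout = \"", l, "\""))
})
names(plots) <- layouts
```

Printing `plots$bin` and `plots$weave` in turn makes the difference in column alignment easy to see.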
overlaps: How to handle overlapping dots or bins in the "bin", "weave", and "hex" layouts (dots never overlap in the "swarm" layout). For the purposes of this argument, dots are only considered to be overlapping if they would be overlapping when dotsize = 1 and stackratio = 1; i.e. if you set those arguments to other values, overlaps may still occur. One of:
"keep": leave overlapping dots as they are. Dots may overlap (usually only slightly) in the "bin", "weave", and "hex" layouts.
"nudge": nudge overlapping dots out of the way. Overlaps are avoided using a constrained optimization which minimizes the squared distance of dots to their desired positions, subject to the constraint that adjacent dots do not overlap.
smooth: Smoother to apply to dot positions. One of:
A function that takes a numeric vector of dot positions and returns a smoothed version of that vector, such as smooth_bounded(), smooth_unbounded(), smooth_discrete(), or smooth_bar().
A string indicating what smoother to use, as the suffix to a function name starting with smooth_; e.g. "none" (the default) applies smooth_none(), which simply returns the given vector without applying smoothing.
Smoothing is most effective when the smoother is matched to the support of the distribution; e.g. using smooth_bounded(bounds = ...).
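Both ways of specifying a smoother can be sketched as below. The example assumes data that live on the unit interval (simulated here from a beta distribution) so that the bounds passed to smooth_bounded() match the support; the data frame `df` and column `p` are made up for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated proportions in (0, 1), assumed here only for illustration
df <- data.frame(p = rbeta(100, 2, 5))

# smoother given as a string: "bounded" is the suffix of smooth_bounded()
p_string <- ggplot(df, aes(x = p)) +
  geom_dots(smooth = "bounded")

# smoother given as a function, with bounds matched to the known support
p_function <- ggplot(df, aes(x = p)) +
  geom_dots(smooth = smooth_bounded(bounds = c(0, 1)))
```

The second form is the one suggested by the text when the support is known, since the bounds are stated explicitly rather than left at their defaults.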
overflow: How to handle overflow of dots beyond the extent of the geom when a minimum binwidth (or an exact binwidth) is supplied. One of:
"keep": Keep the overflow, drawing dots outside the geom bounds.
"compress": Compress the layout. Reduces the binwidth to the size necessary to keep the dots within bounds, then adjusts stackratio and dotsize so that the apparent dot size is the user-specified minimum binwidth times the user-specified dotsize.
If you find the default layout has dots that are too small, and you are okay with dots overlapping, consider setting overflow = "compress" and supplying an exact or minimum dot size using binwidth.
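A minimal sketch of that suggestion follows, using a larger simulated sample so that the default layout would produce small dots. The data frame `df` and the 1.5 mm bin width are assumptions for this example.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# a fairly large simulated sample, assumed here only for illustration
df <- data.frame(value = rnorm(2000))

# default: dots shrink until the tallest stack fits
p_default <- ggplot(df, aes(x = value)) +
  geom_dots()

# fixed 1.5 mm dots; the layout is compressed (dots may overlap)
# instead of overflowing the geom bounds
p_compress <- ggplot(df, aes(x = value)) +
  geom_dots(binwidth = grid::unit(1.5, "mm"), overflow = "compress")
```

With overflow = "keep" and the same binwidth, the stacks would instead be drawn past the edge of the region allocated to the geom.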
verbose: If TRUE, print out the bin width of the dotplot. Can be useful if you want to start from an automatically-selected bin width and then adjust it manually. Bin width is printed both as data units and as normalized parent coordinates or "npc"s (see unit()). Note that if you just want to scale the selected bin width to fit within a desired area, it is probably easier to use scale than to copy and scale binwidth manually, and if you just want to provide constraints on the bin width, you can pass a length-2 vector to binwidth.
orientation: Whether this geom is drawn horizontally or vertically. One of:
NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.
"horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.
"vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.
For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).
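The orientation values can be illustrated with grouped data. In this sketch, `df` with its `group` and `value` columns is simulated for the example; the first plot relies on automatic detection and the other two set the orientation explicitly.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated grouped data, assumed here only for illustration
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 50),
  value = rnorm(150, mean = rep(c(0, 1, 2), each = 50))
)

# orientation = NA (default): detected from the aesthetics;
# here y identifies the groups, so dots are drawn horizontally
p_auto <- ggplot(df, aes(x = value, y = group)) +
  geom_dots()

# explicit vertical orientation: x identifies the groups
p_vertical <- ggplot(df, aes(x = group, y = value)) +
  geom_dots(orientation = "vertical")

# "x" is the ggplot2-style alias for "vertical"
p_alias <- ggplot(df, aes(x = group, y = value)) +
  geom_dots(orientation = "x")
```

`p_vertical` and `p_alias` should render identically, since the two values are aliases.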
na.rm: If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.
show.legend: logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.
inherit.aes: If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().
The dots+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the dots (aka the slab), the point, and the interval.
Positional aesthetics
x: x position of the geometry
y: y position of the geometry
Dots-specific (aka Slab-specific) aesthetics
family: The font family used to draw the dots.
order: The order in which data points are stacked within bins. Can be used to create the effect of "stacked" dots by ordering dots according to a discrete variable. If omitted (NULL), the values of the data points themselves are used to determine stacking order. Only applies when layout is "bin" or "hex", as the other layout methods fully determine both x and y positions.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
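As a hedged example of the side and scale aesthetics, the sketch below sets them to fixed values on grouped, simulated data. The data frame `df` and the value 0.6 are assumptions made for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated grouped data, assumed here only for illustration
df <- data.frame(
  group = rep(c("a", "b"), each = 60),
  value = rnorm(120)
)

# dots mirrored on both sides of each group's baseline (violin-like),
# using 60% of the region allocated to each group
p_both <- ggplot(df, aes(x = value, y = group)) +
  geom_dots(side = "both", scale = 0.6)

# dots drawn below each group's baseline instead of above it
p_bottom <- ggplot(df, aes(x = value, y = group)) +
  geom_dots(side = "bottom")
```

Lowering scale leaves more empty space between neighbouring groups, which can help when stacks would otherwise touch.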
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For the interval, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size also determines the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color/line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
slab_shape: Override for shape: the shape of the dots used to draw the dotplot slab.
Interval-specific color/line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color/line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("dotsinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
The dots family of stats and geoms is similar to geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
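The last point, changing the dot shape, might look like the following sketch. The data are simulated, and shape 23 (a filled diamond in the standard R shape numbering) is an arbitrary choice for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated data, assumed here only for illustration
df <- data.frame(value = rnorm(100))

# dots family: the plain shape aesthetic controls the dots
p_dots <- ggplot(df, aes(x = value)) +
  geom_dots(shape = 23)

# dotsinterval family: shape applies to the point sub-geometry,
# so slab_shape is used to change the dots themselves
p_dotsinterval <- ggplot(df, aes(x = value)) +
  stat_dotsinterval(slab_shape = 23)
```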
Stats and geoms in this family include:
geom_dots(): dotplots on raw data. Ensures the dotplot fits within available space by reducing the size of the dots automatically (may result in very small dots).
geom_swarm() and geom_weave(): dotplots on raw data with defaults intended to create "beeswarm" plots. These use side = "both" by default and set the default dot size to the same size as geom_point() (binwidth = unit(1.5, "mm")), allowing dots to overlap instead of getting very small.
stat_dots(): dotplots on raw data, distributional objects, and posterior::rvar()s
geom_dotsinterval(): dotplot + interval plots on raw data with already-calculated intervals (rarely useful directly)
stat_dotsinterval(): dotplot + interval plots on raw data, distributional objects, and posterior::rvar()s (will calculate intervals for you)
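A short sketch contrasting geom_dots() with the beeswarm-style shortcuts listed above is given below. The data frame `df` is simulated for the example, and geom_swarm() is assumed to need the beeswarm package to be installed when the plot is drawn.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# simulated grouped data, assumed here only for illustration
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 80),
  value = rnorm(240)
)

base <- ggplot(df, aes(x = group, y = value))

# dots shrink as needed so every stack fits in its group's region
p_dots <- base + geom_dots()

# fixed-size dots, mirrored on both sides, "swarm" layout
p_swarm <- base + geom_swarm()

# fixed-size dots, mirrored on both sides, "weave" layout
p_weave <- base + geom_weave()
```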
stat_dots() and stat_dotsinterval(), when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
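A minimal sketch of a quantile dotplot follows. It summarizes a large simulated sample with 100 quantiles, so each dot represents roughly a 1% chance; the data frame `df` and the choice of 100 are assumptions for this example.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# a large simulated sample, assumed here only for illustration
df <- data.frame(value = rnorm(5000, mean = 10, sd = 2))

# plotting all 5000 draws would give tiny dots; instead, the stat
# reduces the sample to 100 quantiles and draws one dot per quantile
p_quantile <- ggplot(df, aes(x = value)) +
  stat_dots(quantiles = 100)
```

Smaller values of quantiles (for example 20 or 50) give larger, more countable dots at the cost of a coarser picture of the distribution.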
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092-5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See stat_dots() for the stat version, intended for use on sample data or analytical distributions. See geom_dotsinterval() for the geometry this shortcut is based on.
See vignette("dotsinterval") for a variety of examples of use.
Other dotsinterval geoms: geom_dotsinterval(), geom_swarm(), geom_weave()
library(dplyr)
library(ggplot2)
library(ggdist)
data(RankCorr_u_tau, package = "ggdist")
# orientation is detected automatically based on
# which axis is discrete
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau)) +
  geom_dots()
RankCorr_u_tau %>%
  ggplot(aes(y = u_tau)) +
  geom_dots()