Geoms and stats for creating dotplots that automatically determine a bin width that ensures the plot fits within the available space. They also ensure dots do not overlap, and allow generation of quantile dotplots using the quantiles argument to stat_dotsinterval()/stat_dots() and stat_dist_dotsinterval()/stat_dist_dots(). They generally follow the naming scheme and arguments of the geom_slabinterval() and stat_slabinterval() family of geoms and stats.
geom_dotsinterval(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  dotsize = 1,
  stackratio = 1,
  binwidth = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

geom_dots(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_dotsinterval(
  mapping = NULL,
  data = NULL,
  geom = "dotsinterval",
  position = "identity",
  ...,
  quantiles = NA,
  point_interval = median_qi,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_dots(
  mapping = NULL,
  data = NULL,
  geom = "dots",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)

stat_dist_dotsinterval(
  mapping = NULL,
  data = NULL,
  geom = "dotsinterval",
  position = "identity",
  ...,
  quantiles = 100,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)

stat_dist_dots(
  mapping = NULL,
  data = NULL,
  geom = "dots",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE
)
data
The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).
stat
The statistical transformation to use on the data for this layer, as a string.
position
Position adjustment, either as a string, or the result of a call to a position adjustment function.
...
Arguments passed on to geom_slabinterval():
side
Which side to draw the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale
What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space.
orientation
Whether this geom is drawn horizontally ("horizontal") or vertically ("vertical"). The default, NA, automatically detects the orientation based on how the aesthetics are assigned, and should generally do an okay job at this. When horizontal (resp. vertical), the geom uses the y (resp. x) aesthetic to identify different groups, then for each group uses the x (resp. y) aesthetic and the thickness aesthetic to draw a function as a slab, and draws points and intervals horizontally (resp. vertically) using the xmin, x, and xmax (resp. ymin, y, and ymax) aesthetics. For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (tidybayes had an orientation parameter before ggplot did, and I think the tidybayes naming scheme is more intuitive: "x" and "y" are not orientations and their mapping to orientations is, in my opinion, backwards; but the base ggplot naming scheme is allowed for compatibility).
justification
Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right", justification is set to 0; when side is "bottom"/"left", justification is set to 1; and when side is "both", justification is set to 0.5.
normalize
How to normalize heights of functions input to the thickness aesthetic. If "all" (the default), normalize so that the maximum height across all data is 1; if "panels", normalize within panels so that the maximum height in each panel is 1; if "xy", normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1; if "groups", normalize within values of the opposite axis and within groups so that the maximum height in each group is 1; if "none", values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).
interval_size_domain
The minimum and maximum of the values of the size aesthetic that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).
interval_size_range
(Deprecated). This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can instead use the interval_size or point_size aesthetics; see scales.
fatten_point
A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.
show_slab
Should the slab portion of the geom be drawn? Default TRUE.
show_point
Should the point portion of the geom be drawn? Default TRUE.
show_interval
Should the interval portion of the geom be drawn? Default TRUE.
dotsize
The size of the dots relative to the bin width. The default, 1, makes dots just about as wide as the bin width.
stackratio
The distance between the centers of the dots in the same stack relative to the bin height. The default, 1, makes dots in the same stack just touch each other.
binwidth
The bin width to use for drawing the dotplots. The default value, NA, will dynamically select a bin width based on the size of the plot when drawn.
na.rm
If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.
show.legend
logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes. It can also be a named logical vector to finely select the aesthetics to display.
inherit.aes
If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().
geom
Use to override the default connection between stat_slabinterval() and geom_slabinterval().
quantiles
For the stat_ and stat_dist_ stats, setting this to a value other than NA will produce a quantile dotplot: that is, a dotplot of quantiles from the sample (for stat_) or a dotplot of quantiles from the distribution (for stat_dist_). The value of quantiles determines the number of quantiles to plot. See Kay et al. (2016) and Fernandes et al. (2018) for more information on quantile dotplots.
point_interval
A function from the point_interval() family (e.g., median_qi, mean_qi, etc). This function should take in a vector of values, and should obey the .width and .simple_names parameters of point_interval() functions, such that when given a vector with .simple_names = TRUE it returns a data frame with variables .value, .lower, .upper, and .width. Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
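As a rough illustration of the dotsize, stackratio, and binwidth arguments described above, the sketch below fixes the bin width instead of relying on the automatic selection. The data frame df, its column value, and the specific numbers are made up for this example, and it assumes the ggplot2 and ggdist packages are installed.

```r
library(ggplot2)
library(ggdist)

# Hypothetical data: 50 evenly spaced quantiles of a standard normal
df <- data.frame(value = qnorm(ppoints(50)))

p <- ggplot(df, aes(x = value)) +
  geom_dots(
    binwidth = 0.25,   # fixed bin width instead of the default NA (automatic)
    dotsize = 0.9,     # dots slightly narrower than the bin width
    stackratio = 1.1   # dot centers slightly further apart than "just touching"
  )

p  # printing the object draws the plot
```

Leaving binwidth at its default of NA is usually the simpler choice; a fixed value is mainly useful when several plots need to share the same dot size.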
A ggplot2::Geom or ggplot2::Stat representing a dotplot or combined dotplot+interval geometry which can be added to a ggplot() object.
These stats support the following aesthetics:
x
y
datatype
thickness
size
group
In addition, in their default configuration (paired with geom_dotsinterval()) the following aesthetics are supported by the underlying geom:
x
y
slab_shape
datatype
alpha
colour
colour_ramp
linetype
fill
shape
stroke
point_colour
point_fill
point_alpha
point_size
size
interval_colour
interval_alpha
interval_size
interval_linetype
slab_size
slab_colour
slab_fill
slab_alpha
slab_linetype
fill_ramp
ymin
ymax
xmin
xmax
width
height
thickness
group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom aesthetics (like interval_color) in the scales documentation.
Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
The dots geoms are similar to geom_dotplot() but with a number of differences:
Dots geoms act like slabs in geom_slabinterval() and can be given x positions (or y positions when in a horizontal orientation).
Given the available space to lay out dots, the dots geoms will automatically determine how many bins to use to fit the available space.
Dots geoms use a dynamic layout algorithm that lays out dots from the center out if the input data are symmetrical, guaranteeing that symmetrical data results in a symmetrical plot. The layout algorithm also prevents dots from overlapping each other.
The shape of the dots in these geoms can be changed using the slab_shape aesthetic (when using the dotsinterval family) or the shape or slab_shape aesthetic (when using the dots family).
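To give a feel for what "automatically determine how many bins to use to fit the available space" can mean, here is a deliberately simplified base R sketch. It is not the layout algorithm used by these geoms: the helper names max_stack_height() and find_binwidth() are hypothetical, dots are assumed to be as tall as the bin is wide, and the available height is assumed to be expressed in the same units as the data.

```r
# Simplified illustration only; NOT the actual layout algorithm of the dots geoms.

# Height of the tallest stack if x is binned with the given bin width,
# assuming each dot is binwidth * stackratio tall (hypothetical helper).
max_stack_height <- function(x, binwidth, stackratio = 1) {
  bins <- floor((x - min(x)) / binwidth)
  max(table(bins)) * binwidth * stackratio
}

# Bisection search for a large bin width whose tallest stack still fits
# into available_height (hypothetical helper).
find_binwidth <- function(x, available_height, stackratio = 1, iterations = 30) {
  lo <- 0                 # lower bound; only ever raised to a width that fits
  hi <- diff(range(x))    # upper bound: a bin as wide as the whole data range
  if (max_stack_height(x, hi, stackratio) <= available_height) return(hi)
  for (i in seq_len(iterations)) {
    mid <- (lo + hi) / 2
    if (max_stack_height(x, mid, stackratio) <= available_height) {
      lo <- mid           # fits: try a wider bin
    } else {
      hi <- mid           # too tall: try a narrower bin
    }
  }
  lo
}

x <- qnorm(ppoints(100))                      # 100 evenly spaced normal quantiles
bw <- find_binwidth(x, available_height = 2)  # widest bin found that fits
```

Wider bins produce larger dots but taller stacks, so the search settles on the largest width it can find whose tallest stack still fits. A real implementation also has to account for the plot's physical size, which is why the default binwidth = NA is resolved only when the plot is drawn.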
The stat_... and stat_dist_... versions of the stats, when used with the quantiles argument, are particularly useful for constructing quantile dotplots, which can be an effective way to communicate uncertainty using a frequency framing that may be easier for laypeople to understand (Kay et al. 2016, Fernandes et al. 2018).
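The frequency framing can be sketched in base R without any plotting. The example below assumes a made-up Normal(10, 2) predictive distribution (say, minutes until a bus arrives) and picks 20 evenly spaced quantiles with ppoints(); that particular choice of probabilities is an assumption of this sketch, not a statement about how the stats compute them.

```r
# Hypothetical predictive distribution: Normal(mean = 10, sd = 2) minutes
n_quantiles <- 20

# Evenly spaced probabilities: (i - 0.5) / n for i = 1..n
p <- ppoints(n_quantiles, a = 1/2)

# One dot per quantile; these are the values a quantile dotplot would show
q <- qnorm(p, mean = 10, sd = 2)

# Frequency framing: count the dots at or below 8 minutes
dots_below <- sum(q <= 8)
approx_prob <- dots_below / n_quantiles   # 3 of 20 dots, i.e. 0.15
exact_prob <- pnorm(8, mean = 10, sd = 2) # about 0.159
```

Counting 3 dots out of 20 gives a reader roughly the same answer as the exact probability, without requiring them to reason about areas under a density curve.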
Kay, M., Kola, T., Hullman, J. R., & Munson, S. A. (2016). When (ish) is My Bus? User-centered Visualizations of Uncertainty in Everyday, Mobile Predictive Systems. Conference on Human Factors in Computing Systems - CHI '16, 5092-5103. doi:10.1145/2858036.2858558.
Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018). Uncertainty Displays Using Quantile Dotplots or CDFs Improve Transit Decision-Making. Conference on Human Factors in Computing Systems - CHI '18. doi:10.1145/3173574.3173718.
See stat_sample_slabinterval() and stat_dist_slabinterval() for families of other stats built on top of geom_slabinterval().
See vignette("slabinterval") for a variety of examples of use.
library(dplyr)
library(ggplot2)

data(RankCorr_u_tau, package = "ggdist")

# orientation is detected automatically based on
# which axis is discrete
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau)) +
  geom_dots()

RankCorr_u_tau %>%
  ggplot(aes(y = u_tau)) +
  geom_dots()

# stat_dots can summarize quantiles, creating quantile dotplots
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i))) +
  stat_dots(quantiles = 100)

# color and fill aesthetics can be mapped within the geom
# dotsinterval adds an interval
RankCorr_u_tau %>%
  ggplot(aes(x = u_tau, y = factor(i), fill = stat(x > 6))) +
  stat_dotsinterval(quantiles = 100)