
overlapping (version 1.8)

overlap: Overlapping estimation

Description

Estimates the overlapping area of two or more kernel density estimates computed from empirical data.

Usage

overlap( x, nbins = 1024, plot = FALSE, 
    partial.plot = FALSE, boundaries = NULL, 
    return.complete.data = FALSE, ... )

Value

It returns a list containing the following components:

DD

Data frame with the information used to compute the overlap, returned only if return.complete.data = TRUE. It contains the following variables: x, coordinates of the points where the density is estimated; y1 and y2, densities; ovy, density used to estimate the overlapping area (i.e. min(y1,y2)); ally, density used to estimate the whole area (i.e. max(y1,y2)); dominance, indicator of which distribution has the higher density; k, label indicating which distributions are compared.

OV

Estimates of the overlapping area for each pair of distributions.

xpoints

List of abscissas of intersection points among the density curves.
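
The ovy and ally variables described above suggest one way such an estimate could be formed. The following is a minimal sketch in base R, not the package's actual implementation: it assumes the overlap is taken as the area under min(y1, y2) divided by the area under max(y1, y2), with both densities evaluated on a shared grid. The function name overlap_sketch and the grid padding are hypothetical.

```r
# Hypothetical helper for illustration; not part of the overlapping package.
overlap_sketch <- function(a, b, nbins = 1024) {
  # Pad the shared grid by three bandwidths (an assumed choice) to cover the tails.
  pad <- 3 * max(bw.nrd0(a), bw.nrd0(b))
  lo <- min(a, b) - pad
  hi <- max(a, b) + pad

  # Evaluate both kernel densities at the same equally spaced points.
  y1 <- density(a, n = nbins, from = lo, to = hi)$y
  y2 <- density(b, n = nbins, from = lo, to = hi)$y

  ovy  <- pmin(y1, y2)  # pointwise minimum: shared area
  ally <- pmax(y1, y2)  # pointwise maximum: whole area

  # The grid spacing is constant, so it cancels in the ratio.
  sum(ovy) / sum(ally)
}

set.seed(1)
a <- rnorm(200)
b <- rnorm(200, mean = 1)
overlap_sketch(a, a)  # identical samples give a ratio of 1
overlap_sketch(a, b)  # shifted sample gives a value between 0 and 1
```

Under this assumed definition the result is bounded between 0 (no shared area) and 1 (identical density estimates).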

Arguments

x

list of numerical vectors to be compared; each vector is an element of the list

nbins

number of equally spaced points at which the overlapping density is evaluated; see density for details

plot

logical, if TRUE, final plot of estimated densities and overlapped areas is produced

partial.plot

logical, if TRUE, partial paired distributions are plotted

boundaries

an optional list for bounded distributions, see Details

return.complete.data

logical, if TRUE, return a data frame with information used for computing overlapping (see Value).

...

optional arguments to be passed to function density

Author

Massimiliano Pastore

Details

If the list x contains more than two elements (i.e. more than two distributions), the function computes the overlap for every pair of distributions. Partial plots refer to these paired distributions.
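
As an illustration of what "all paired distributions" means, the sketch below enumerates the pairs for a three-element list using base R's combn. It only shows how many comparisons to expect; the hyphenated labels are an assumption about how pairs might be named, not a statement about the package's output.

```r
# Three distributions yield choose(3, 2) = 3 pairwise comparisons.
x <- list(X1 = rnorm(100), X2 = rt(50, 8), X3 = rchisq(80, 2))

# Build one label per unordered pair of list names (assumed label format).
labels <- combn(names(x), 2, FUN = paste, collapse = "-")
labels
# [1] "X1-X2" "X1-X3" "X2-X3"
```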

If plot = TRUE, all overlapping areas are plotted; this requires the ggplot2 package.

The optional list boundaries must contain two elements, from and to, giving the empirical limits of the input variables. Each element must either have the same length as the input data list x or have length one, in which case the same boundary applies to all distributions. See examples below.
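
The sketch below illustrates the idea of per-distribution limits using only base R. It assumes that a length-one from or to is recycled across all elements of x and that the limits are handed to density through its from and to arguments; this is an illustration, not the package's internal code.

```r
set.seed(1)
x <- list(X1 = runif(100), X2 = runif(100, 0.5, 1))
boundaries <- list(from = 0, to = 1)  # length one: same limits for every element

# Recycle each boundary to one value per distribution (assumed behaviour).
from <- rep_len(boundaries$from, length(x))
to   <- rep_len(boundaries$to, length(x))

# Estimate each density only within its own limits.
dens <- Map(function(v, lo, hi) density(v, n = 1024, from = lo, to = hi),
            x, from, to)

# Each estimate is evaluated on a grid spanning exactly [from, to].
sapply(dens, function(d) range(d$x))
```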

References

Pastore, M. (2018). Overlapping: a R package for Estimating Overlapping in Empirical Distributions. The Journal of Open Source Software, 3(32), 1023. https://doi.org/10.21105/joss.01023

Pastore, M., Calcagnì, A. (2019). Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Frontiers in Psychology, 10:1089. https://doi.org/10.3389/fpsyg.2019.01089

Examples

library(overlapping)  # provides overlap(); plotting additionally requires ggplot2

set.seed(20150605)
x <- list(X1=rnorm(100), X2=rt(50,8), X3=rchisq(80,2))
out <- overlap(x, plot=TRUE)
out$OV

# including boundaries
x <- list(X1=runif(100), X2=runif(100,.5,1))
boundaries <- list( from = c(0,.5), to = c(1,1) )
out <- overlap(x, plot=TRUE, boundaries=boundaries)
out$OV

# equal boundaries
x <- list(X1=runif(100), X2=runif(50), X3=runif(30))
boundaries <- list( from = 0, to = 1 )
out <- overlap(x, plot=TRUE, boundaries=boundaries)
out$OV

# changing kernel
out <- overlap(x, plot=TRUE, kernel="rectangular")
out$OV
