coverage: Interval statistics

Description

Calculate coverage intervals and confidence intervals for the sample mean, median, sd, proportion, ... Typically, these functions are used within df_stats(). For the mean, median, and sd, the variable x must be quantitative. For proportions, x can be of any type; use the success argument to specify which value the proportion is computed for. The default for success is TRUE when x is logical, or the first value returned by unique() for categorical or numerical variables.

Usage

coverage(x, level = 0.95, na.rm = TRUE)
ci.mean(x, level = 0.95, na.rm = TRUE)
ci.median(x, level = 0.9, na.rm = TRUE)
ci.sd(x, level = 0.95, na.rm = TRUE)
ci.prop(
  x,
  success = NULL,
  level = 0.95,
  method = c("Clopper-Pearson", "binom.test", "Score", "Wilson", "prop.test", "Wald",
    "Agresti-Coull", "Plus4")
)

Value

A named numeric vector with components lower and upper and, in the case of ci.prop(), center. When used with df_stats(), these components are formed into a data frame.

Arguments

x: a variable.
level: a number between 0 and 1 specifying the confidence level for the interval. (Default: 0.95, except for ci.median(), which defaults to 0.9 as shown under Usage.)
na.rm: if TRUE, disregard missing data.
success: for proportions, the value of x whose proportion is calculated. Default: TRUE when x is logical; otherwise the first value returned by unique().
method: for ci.prop(), the method to use in calculating the confidence interval. See mosaic::binom.test() for details.
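
The default rule for success described above can be illustrated with a short base-R sketch. The helper name default_success() is hypothetical and is not part of the package; it only mirrors the rule as stated in this page and ignores missing values.

```r
# Hypothetical helper (not from the package): mirrors the documented default
# for `success`. TRUE for logical x, otherwise the first value from unique().
default_success <- function(x) {
  if (is.logical(x)) TRUE else unique(x)[1]
}

s <- default_success(mtcars$cyl)   # first value appearing in mtcars$cyl
p_hat <- mean(mtcars$cyl == s)     # sample proportion for that value
```

Passing success explicitly, as in the Examples, avoids depending on the order of the data.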

Details

Methods: ci.mean() uses the standard t confidence interval. ci.median() uses the normal approximation method. ci.sd() uses the chi-squared method. ci.prop() uses the binomial method.

In the usual situation where the mosaic package is available, ci.prop() uses mosaic::binom.test() internally, which provides several methods for the calculation. See the documentation for binom.test() for details about the available methods. Clopper-Pearson is the default method.

When used with df_stats(), the confidence interval is calculated for each group separately. For "pooled" confidence intervals, see methods such as lm() or glm().
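
As a rough illustration of the methods named above, the following base-R sketch computes a t interval for the mean, a chi-squared interval for the standard deviation, and a Clopper-Pearson interval for a proportion. It uses only functions from the stats package and is not the package's own source. The variable names are chosen for illustration, and ci.median() is left out because this page does not spell out its normal approximation.

```r
# Sketch only: base-R versions of the intervals described in Details.
x <- mtcars$hp            # no missing values, so na.rm is not needed here
level <- 0.95
alpha <- 1 - level
n <- length(x)

# t interval for the mean: mean +/- t * sd / sqrt(n)
t_crit <- qt(1 - alpha / 2, df = n - 1)
mean_ci <- mean(x) + c(lower = -1, upper = 1) * t_crit * sd(x) / sqrt(n)

# Chi-squared interval for the sd: (n - 1) * s^2 / sigma^2 ~ chi-squared(n - 1)
sd_ci <- c(
  lower = sqrt((n - 1) * var(x) / qchisq(1 - alpha / 2, df = n - 1)),
  upper = sqrt((n - 1) * var(x) / qchisq(alpha / 2, df = n - 1))
)

# Clopper-Pearson (exact binomial) interval for the proportion of 6-cylinder cars
k <- sum(mtcars$cyl == 6)
prop_ci <- binom.test(k, n, conf.level = level)$conf.int
```

The mean interval should agree with the one reported by t.test(x), which is a convenient cross-check.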

Examples

# The central 95% interval
df_stats(hp ~ cyl, data = mtcars, c95 = coverage(0.95))
# The confidence interval on the mean
df_stats(hp ~ cyl, data = mtcars, mean, ci.mean)
# What fraction of cars have 6 cylinders?
df_stats(mtcars, ~ cyl, six_cyl_prop = ci.prop(success = 6, level = 0.90))
# Use without `df_stats()` (rare)
ci.mean(mtcars$hp)