mosaic (version 1.9.1)

confint: Confidence interval methods for output of resampling

Description

Methods for confint to compute confidence intervals on numerical vectors and numerical components of data frames.

Usage

# S3 method for numeric
confint(
  object,
  parm,
  level = 0.95,
  ...,
  method = "percentile",
  margin.of.error = "stderr" %in% method
)

# S3 method for do.tbl_df
confint(
  object,
  parm,
  level = 0.95,
  ...,
  method = "percentile",
  margin.of.error = "stderr" %in% method,
  df = NULL
)

# S3 method for do.data.frame
confint(
  object,
  parm,
  level = 0.95,
  ...,
  method = "percentile",
  margin.of.error = "stderr" %in% method,
  df = NULL
)

# S3 method for data.frame
confint(object, parm, level = 0.95, ...)

# S3 method for summary.lm
confint(object, parm, level = 0.95, ...)

Value

When applied to a data frame, returns a data frame giving the confidence interval for each variable in the data frame using t.test or binom.test, unless the data frame was produced using do(). In that case, each variable is assumed to contain resampled statistics that serve as an estimated sampling distribution. A confidence interval is then computed either from a central proportion of this distribution or from the standard error, as estimated by the standard deviation of the estimated sampling distribution. For the standard error method, the user must supply the correct degrees of freedom for the t distribution, since this information is typically not available in the output of do().

When applied to a numerical vector, returns a vector.

Arguments

object

an R object

parm

a vector of parameters

level

a confidence level

...

additional arguments

method

a character vector of methods to use for creating confidence intervals. Choices are "percentile" (or "quantile"), which is the default; "stderr" (or "se"); "bootstrap-t"; and "reverse" (or "basic").

margin.of.error

if TRUE, report intervals as a center and margin of error.

df

degrees of freedom. This is required when object was produced using do() and the standard error is used to compute the confidence interval, since this information is typically not recorded in these objects. The default (Inf) uses a normal critical value rather than one derived from a t-distribution.
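
The effect of df on the standard error interval can be seen from the critical values alone. The following is a minimal base-R sketch that does not call mosaic; the finite df values (9 and 99) are chosen only for illustration.

```r
# Critical values for a 95% interval under different degrees of freedom.
level <- 0.95
a <- (1 - level) / 2

crit_normal <- qnorm(1 - a)          # normal critical value
crit_t_inf  <- qt(1 - a, df = Inf)   # t with infinite df matches the normal
crit_t_9    <- qt(1 - a, df = 9)     # small df: wider interval
crit_t_99   <- qt(1 - a, df = 99)    # larger df: closer to the normal value

print(c(normal = crit_normal, t_inf = crit_t_inf,
        t_99 = crit_t_99, t_9 = crit_t_9))
```

Smaller degrees of freedom give a larger critical value and therefore a wider interval, which is why supplying a sensible df matters most for small samples.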

Details

The methods of producing confidence intervals from bootstrap distributions are currently quite naive. In particular, when using the standard error, assistance may be required with the degrees of freedom, and it may not be possible to provide a correct value in all situations. None of the methods include explicit bias correction.

Let \(q_a\) be the \(a\) quantile of the bootstrap distribution, let \(t_{a,df}\) be the \(a\) quantile of the t distribution with \(df\) degrees of freedom, let \(SE_b\) be the standard deviation of the bootstrap distribution, and let \(\hat{\theta}\) be the estimate computed from the original data. Then the confidence intervals with confidence level \(1 - 2a\) are

quantile

\((q_a, q_{1-a}) \)

reverse

\(( 2 \hat{\theta} - q_{1-a}, 2\hat{\theta} - q_{a} )\)

stderr

\((\hat{\theta} - t_{1-a,df} SE_b, \hat{\theta} + t_{1-a,df} SE_b) \). When df is not provided, an attempt is made to determine an appropriate value, but this should be double checked. In particular, missing data can lead to unreliable results.
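
The three formulas above can be sketched in base R as follows. This is an illustration of the formulas only, not the package's internal code; the sample x, the number of resamples, and the choice df = n - 1 are assumptions made for the example.

```r
# Base-R sketch of the quantile, reverse, and stderr intervals.
set.seed(1)
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4)  # made-up sample
theta_hat <- mean(x)                  # estimate from the original data

# Bootstrap distribution of the sample mean
boot <- replicate(2000, mean(sample(x, replace = TRUE)))

a    <- 0.025                         # confidence level is 1 - 2a = 0.95
q    <- unname(quantile(boot, c(a, 1 - a)))   # q_a and q_{1-a}
se_b <- sd(boot)                      # SE_b
df   <- length(x) - 1                 # assumed degrees of freedom

ci_quantile <- c(lower = q[1], upper = q[2])
ci_reverse  <- c(lower = 2 * theta_hat - q[2], upper = 2 * theta_hat - q[1])
ci_stderr   <- theta_hat + c(lower = -1, upper = 1) * qt(1 - a, df) * se_b

print(rbind(quantile = ci_quantile, reverse = ci_reverse, stderr = ci_stderr))
```

The reverse interval has the same width as the quantile interval but is reflected about \(\hat{\theta}\), and the stderr interval is symmetric about \(\hat{\theta}\) by construction.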

The bootstrap-t confidence interval is computed much like the reverse confidence interval, but the bootstrap t distribution is used in place of a theoretical t distribution. This interval has much better properties than the reverse (or basic) method, which is here for comparison purposes only and is not recommended. The t-statistic is computed from a mean, a standard deviation, and a sample size, which must be named "mean", "sd", and "n", as they are when using favstats().
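
As a rough illustration, a textbook bootstrap-t interval for a mean can be sketched in base R as below. This follows the general construction described in the Hesterberg reference and is not necessarily identical to what the package computes; the sample x and the number of resamples are assumptions made for the example.

```r
# Base-R sketch of a bootstrap-t interval for a mean.
set.seed(2)
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4)  # made-up sample
n <- length(x)
theta_hat <- mean(x)
se_hat <- sd(x) / sqrt(n)             # standard error from the original data

# One t statistic per resample, built from that resample's mean, sd, and n
t_star <- replicate(2000, {
  xb <- sample(x, replace = TRUE)
  (mean(xb) - theta_hat) / (sd(xb) / sqrt(n))
})

a   <- 0.025
q_t <- unname(quantile(t_star, c(a, 1 - a)))  # quantiles of the bootstrap t

# Quantiles enter in reversed order, as in the reverse interval
ci_boot_t <- c(lower = theta_hat - q_t[2] * se_hat,
               upper = theta_hat - q_t[1] * se_hat)
print(ci_boot_t)
```

Because the quantiles come from the resampled t statistics rather than from qt(), the resulting interval need not be symmetric about the estimate.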

References

Tim C. Hesterberg (2015): What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum, The American Statistician, https://www.tandfonline.com/doi/full/10.1080/00031305.2015.1089789.

Examples

if (require(mosaicData)) {
  bootstrap <- do(500) * diffmean( age ~ sex, data = resample(HELPrct) )
  confint(bootstrap)
  confint(bootstrap, method = "percentile")
  confint(bootstrap, method = "boot")
  confint(bootstrap, method = "se", df = nrow(HELPrct) - 1)
  confint(bootstrap, margin.of.error = FALSE)
  confint(bootstrap, margin.of.error = TRUE, level = 0.99, 
    method = c("se", "perc") )
    
  # bootstrap t method requires columns named mean, sd, and n (as produced by favstats())
  bootstrap2 <- do(500) * favstats(resample(1:10)) 
  confint(bootstrap2, method = "boot")
}
lm(width ~ length * sex, data = KidsFeet) |>
  summary() |>
  confint()
