bootCase: Case bootstrap for regression models

Description

This routine performs case-bootstrap resampling for regression models. It returns a matrix of the coefficient estimates from each bootstrap sample.

Usage

# S3 method for lm
bootCase(object, f.=coef, B=999)
# S3 method for glm
bootCase(object, f.=coef, B=999)
# S3 method for nls
bootCase(object, f.=coef, B=999)
nextBoot(object, sample)

Arguments

object

A regression object of class lm, glm or nls. May work with other regression objects that support the update method and have a subset argument. See Details below.

f.

A function that will be applied to the updated regression object to compute the statistics of interest. The default is coef, to return the regression coefficient estimates.

B

Number of bootstrap samples.

sample

A sample with replacement of the integers from 1 to n, where n is the non-missing sample size, that defines a bootstrap sample.

Value

A matrix (with class c("bootCase", "matrix")) with B rows and rank(object) columns giving the bootstrap estimates. These can be summarized as needed using standard R tools. The returned object has an attribute "pointEstimate" that contains the value of the function f. applied to the argument object.

Details

This routine performs the case bootstrap described in the references below. Begin with a regression object. For each of B bootstrap samples, sample the non-missing rows of the data matrix with replacement, then recompute and save the estimates. For nls objects there may be convergence problems in the bootstrap. The routine will continue until convergence is attained B times, or until there are 25 consecutive failures to converge. nextBoot is an internal function that updates a model correctly, depending on the class of the model object.
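
The steps above can be sketched in base R. This is an illustration only, not the package's implementation: the function name case_boot_sketch and the argument max_fail are hypothetical, and the sketch assumes the model was fitted to data with no missing rows, so that the integers 1 to n index the rows of the data directly.

```r
# Illustrative sketch of a case bootstrap (hypothetical helper, not bootCase itself).
case_boot_sketch <- function(object, f. = coef, B = 99, max_fail = 25) {
  n <- length(residuals(object))   # number of cases used in the fit
  point <- f.(object)              # statistic computed on the original fit
  out <- matrix(NA_real_, nrow = B, ncol = length(point),
                dimnames = list(NULL, names(point)))
  done <- 0    # successful refits so far
  fails <- 0   # consecutive failed refits
  while (done < B && fails < max_fail) {
    idx <- sample.int(n, n, replace = TRUE)   # resample cases with replacement
    # do.call() passes idx by value, so the refit does not depend on
    # the environment in which the model formula was created.
    refit <- try(do.call(update, list(object, subset = idx)), silent = TRUE)
    if (inherits(refit, "try-error")) {
      fails <- fails + 1
    } else {
      done <- done + 1
      fails <- 0
      out[done, ] <- f.(refit)
    }
  }
  out <- out[seq_len(done), , drop = FALSE]   # keep only completed rows
  attr(out, "pointEstimate") <- point
  out
}

set.seed(1)
m1 <- lm(Fertility ~ ., data = swiss)
boot <- case_boot_sketch(m1, B = 50)
apply(boot, 2, sd)   # bootstrap standard deviations of the coefficients
```

The loop stops early only when max_fail refits fail in a row, which mirrors the stopping rule described for nls objects; for an lm fit such as the one above, failures are not expected.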

This simple routine should return a result with any S3 regression object that can be updated using the update function and has a subset argument. It is generally appropriate for linear regression and for logistic regression in which the response is either zero or one. With binomial responses, one would generally want to resample one observation, not all the observations in m trials, so this function will give incorrect results. The function can be used with Poisson regression with Poisson sampling, but it is probably wrong for contingency tables with multinomial sampling. It is OK for proportional odds models without Frequencies set, but inappropriate with Frequencies.
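
One way to make case resampling act on single observations for a binomial response is to expand the grouped data to one zero/one row per trial before fitting. The following is a minimal sketch under that assumption; the data frame and its column names (dose, y, m, resp) are made up for illustration.

```r
# Hypothetical grouped binomial data: y successes out of m trials at each dose.
grouped <- data.frame(dose = c(1, 2, 3, 4),
                      y    = c(2, 5, 9, 14),
                      m    = c(20, 20, 20, 20))

# Expand to one row per trial with a zero/one response.
ungrouped <- data.frame(
  dose = rep(grouped$dose, times = grouped$m),
  resp = unlist(Map(function(y, m) rep(c(1, 0), times = c(y, m - y)),
                    grouped$y, grouped$m))
)

# Both fits maximize the same likelihood, so the coefficients should agree
# up to the convergence tolerance of the fitting algorithm.
fit_grouped   <- glm(cbind(y, m - y) ~ dose, family = binomial, data = grouped)
fit_ungrouped <- glm(resp ~ dose, family = binomial, data = ungrouped)
all.equal(coef(fit_grouped), coef(fit_ungrouped), tolerance = 1e-4)
```

Resampling the rows of ungrouped then resamples individual trials, whereas resampling the rows of grouped would resample all m trials at a dose together.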

References

Fox, J. and Weisberg, S. (2011) An R Companion to Applied Regression, Second Edition. Thousand Oaks: Sage.

Weisberg, S. (2014) Applied Linear Regression, Fourth Edition, Wiley, Chapters 4 and 11.

Examples


library(car)  # bootCase() is provided by the car package
m1 <- lm(Fertility ~ ., swiss)
betahat <- coef(m1)
betahat.boot <- bootCase(m1, B=99) # 99 bootstrap samples--too small to be useful
summary(betahat.boot)  # default summary
cbind("Bootstrap SD"=apply(betahat.boot, 2, sd),
    t(apply(betahat.boot, 2, function(x) quantile(x, c(.025, .975)))))
