A number of statistical summary functions are provided for use with summary.formula and summarize (as well as with tapply and on their own).
smean.cl.normal computes 3 summary variables: the sample mean and lower and upper Gaussian confidence limits based on the t-distribution.
smean.sd computes the mean and standard deviation.
smean.sdl computes the mean plus or minus a constant times the standard deviation.
smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality.
smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas.
These functions all delete NAs automatically by default.
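To make the descriptions above concrete, here is a minimal base-R sketch of the three closed-form summaries. The function names (`mean_cl_normal_sketch`, `mean_sdl_sketch`, `median_hilow_sketch`) are hypothetical and are not part of Hmisc; the formulas follow the documented defaults, but the package's own implementation may differ in detail.

```r
# Hypothetical base-R sketches; these are NOT the Hmisc functions themselves.

# Mean with t-based confidence limits: mean +/- t quantile * standard error
mean_cl_normal_sketch <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]                      # drop NAs, mirroring na.rm=TRUE
  n <- length(x)
  mult <- qt((1 + conf.int) / 2, n - 1)  # same form as the documented default for mult
  se <- sd(x) / sqrt(n)                  # standard error of the mean
  m <- mean(x)
  c(Mean = m, Lower = m - mult * se, Upper = m + mult * se)
}

# Mean plus or minus a constant times the standard deviation
mean_sdl_sketch <- function(x, mult = 2) {
  x <- x[!is.na(x)]
  m <- mean(x)
  s <- sd(x)
  c(Mean = m, Lower = m - mult * s, Upper = m + mult * s)
}

# Median with a pair of outer quantiles having equal tail areas
median_hilow_sketch <- function(x, conf.int = 0.95) {
  x <- x[!is.na(x)]
  probs <- c(0.5, (1 - conf.int) / 2, (1 + conf.int) / 2)
  q <- quantile(x, probs, names = FALSE)
  c(Median = q[1], Lower = q[2], Upper = q[3])
}

set.seed(1)
x <- c(rnorm(100), NA)                   # one NA to show automatic removal
mean_cl_normal_sketch(x)
mean_sdl_sketch(x)
median_hilow_sketch(x, conf.int = 0.5)   # 25th and 75th percentiles
```

The t-based limits from the sketch should agree with those reported by `t.test(x)$conf.int`, which is a convenient cross-check.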
smean.cl.normal(x, mult=qt((1+conf.int)/2,n-1), conf.int=.95, na.rm=TRUE)
smean.sd(x, na.rm=TRUE)
smean.sdl(x, mult=2, na.rm=TRUE)
smean.cl.boot(x, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)
smedian.hilow(x, conf.int=.95, na.rm=TRUE)
Each function returns a vector of summary statistics.
x: for the summary functions smean.* and smedian.hilow, a numeric vector from which NAs will be removed automatically.
na.rm: defaults to TRUE, unlike built-in functions, so that NAs are removed automatically by default.
mult: for smean.cl.normal, the multiplier of the standard error of the mean to use in obtaining confidence limits of the population mean (the default is the appropriate quantile of the t distribution). For smean.sdl, mult is the multiplier of the standard deviation used in obtaining a coverage interval about the sample mean. The default is mult=2, to use plus or minus 2 standard deviations.
conf.int: for smean.cl.normal and smean.cl.boot, specifies the confidence level (0-1) for interval estimation of the population mean. For smedian.hilow, conf.int is the coverage probability the outer quantiles should target. When the default, 0.95, is used, the lower and upper quantiles computed are 0.025 and 0.975.
B: number of bootstrap resamples for smean.cl.boot.
reps: set to TRUE to have smean.cl.boot return the vector of bootstrapped means as the reps attribute of the returned object.
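The roles of conf.int, B, and reps can be illustrated with a small base-R bootstrap. This is only a sketch under stated assumptions: `boot_mean_cl_sketch` is a hypothetical name, and the limits here are taken as percentiles of the resampled means, which may not match the internals of smean.cl.boot exactly.

```r
# Hypothetical base-R sketch of a nonparametric bootstrap for the mean.
# Assumption: limits are percentiles of the bootstrapped means.
boot_mean_cl_sketch <- function(x, conf.int = 0.95, B = 1000, reps = FALSE) {
  x <- x[!is.na(x)]                       # drop NAs, mirroring na.rm=TRUE
  n <- length(x)
  # Draw all resample indices at once; each column is one resample of size n
  idx <- matrix(sample.int(n, n * B, replace = TRUE), nrow = n, ncol = B)
  boot_means <- colMeans(matrix(x[idx], nrow = n, ncol = B))
  # Equal-tailed limits at the requested confidence level
  probs <- c((1 - conf.int) / 2, (1 + conf.int) / 2)
  lims <- quantile(boot_means, probs, names = FALSE)
  res <- c(Mean = mean(x), Lower = lims[1], Upper = lims[2])
  # Optionally attach the B bootstrapped means, as the reps argument describes
  if (reps) attr(res, "reps") <- boot_means
  res
}

set.seed(2)
y <- rexp(200)                            # skewed data, so normality is doubtful
ci <- boot_mean_cl_sketch(y, B = 2000, reps = TRUE)
ci
length(attr(ci, "reps"))                  # one bootstrapped mean per resample
```

Keeping the bootstrapped means as an attribute lets the caller reuse them, for example to form an interval for a difference between two groups.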
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
See also: summarize, summary.formula
library(Hmisc)
set.seed(1)
x <- rnorm(100)
smean.sd(x)
smean.sdl(x)
smean.cl.normal(x)
smean.cl.boot(x)
smedian.hilow(x, conf.int=.5) # 25th and 75th percentiles
# Function to compute a 0.95 confidence interval for the difference in two means
# g is the grouping variable (two levels)
bootdif <- function(y, g) {
  g <- as.factor(g)
  # Bootstrapped means within each group, returned via the 'reps' attribute
  a <- attr(smean.cl.boot(y[g==levels(g)[1]], B=2000, reps=TRUE),'reps')
  b <- attr(smean.cl.boot(y[g==levels(g)[2]], B=2000, reps=TRUE),'reps')
  # Observed difference in means (second level minus first)
  meandif <- diff(tapply(y, g, mean, na.rm=TRUE))
  # 0.025 and 0.975 quantiles of the bootstrapped differences
  a.b <- quantile(b-a, c(.025,.975))
  res <- c(meandif, a.b)
  names(res) <- c('Mean Difference','.025','.975')
  res
}