ci.mean.diff: Confidence Interval for the Difference in Arithmetic Means

Description

This function computes a confidence interval for the difference in arithmetic means in a two-sample and paired-sample design samples with known or unknown population standard deviation or population variance for one or more variables, optionally by a grouping and/or split variable.

Usage

ci.mean.diff(x, ...)
# S3 method for default
ci.mean.diff(x, y, sigma = NULL, sigma2 = NULL,
             var.equal = FALSE, paired = FALSE,
             alternative = c("two.sided", "less", "greater"),
             conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE,
             digits = 2, as.na = NULL, check = TRUE, output = TRUE, ...)
# S3 method for formula
ci.mean.diff(formula, data, sigma = NULL, sigma2 = NULL,
             var.equal = FALSE, alternative = c("two.sided", "less", "greater"),
             conf.level = 0.95, group = NULL, split = NULL, sort.var = FALSE,
             na.omit = FALSE, digits = 2, as.na = NULL, check = TRUE,
             output = TRUE, ...)

Arguments

a numeric vector of data values.

sigma

a numeric vector indicating the population standard deviation(s) when computing confidence intervals for the difference in arithmetic means with known standard deviation(s). In case of independent samples, equal standard deviation is assumed when specifying one value for the argument sigma; when specifying two values for the argument sigma, unequal variance is assumed Note that either argument sigma or argument sigma2 is specified and it is only possible to specify one value (i.e., equal variance assumption) or two values (i.e., unequal variance assumption) for the argument sigma even though multiple variables are specified in x.

sigma2

a numeric vector indicating the population variance(s) when computing confidence intervals for the difference in arithmetic means with known variance(s). In case of independent samples, equal variance is assumed when specifying one value for the argument sigma2; when specifying two values for the argument sigma, unequal variance is assumed. Note that either argument sigma or argument sigma2 is specified and it is only possible to specify one value (i.e., equal variance assumption) or two values (i.e., unequal variance assumption) for the argument sigma even though multiple variables are specified in x.

var.equal

logical: if TRUE, the population variance in the independent samples are assumed to be equal.

paired

logical: if TRUE, confidence interval for the difference of arithmetic means in paired samples is computed.

alternative

a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less".

conf.level

a numeric value between 0 and 1 indicating the confidence level of the interval.

group

a numeric vector, character vector or factor as grouping variable. Note that a grouping variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

split

a numeric vector, character vector or factor as split variable. Note that a split variable can only be used when computing confidence intervals with unknown population standard deviation and population variance.

sort.var

logical: if TRUE, output table is sorted by variables when specifying group.

digits

an integer value indicating the number of decimal places to be used.

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis. Note that as.na() function is only applied to x, but not to group or split.

check

logical: if TRUE, argument specification is checked.

output

logical: if TRUE, output is shown on the console.

formula

in case of a between-subject design (i.e., paired = FALSE), a formula of the form y ~ group for one outcome variable or cbind(y1, y2, y3) ~ group for more than one outcome variable where y is a numeric variable giving the data values and group a numeric variable, character variable or factor with two values or factor levels given the corresponding groups; in case of a within-subject design (i.e., paired = TRUE), a formula of the form post ~ pre where post and pre are numeric variables. Note that analysis for more than one outcome variable is not permitted in within-subject design.

data

a matrix or data frame containing the variables in the formula formula.

na.omit

logical: if TRUE, incomplete cases are removed before conducting the analysis (i.e., listwise deletion) when specifying more than one outcome variable.

...

further arguments to be passed to or from methods.

Value

Returns an object of class misty.object, which is a list with following entries: function call (call), type of analysis (type), list with the input specified in x, group, and split (data), specification of function arguments (args), and result table (result).

References

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology - Using R and SPSS. John Wiley & Sons.

Examples

Run this code

# NOT RUN {
dat1 <- data.frame(group1 = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
                              1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2),
                   group2 = c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 2, 2,
                              1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2),
                   group3 = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
                              1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
                   x1 = c(3, 1, 4, 2, 5, 3, 2, 3, 6, 4, 3, NA, 5, 3,
                          3, 2, 6, 3, 1, 4, 3, 5, 6, 7, 4, 3, 6, 4),
                   x2 = c(4, NA, 3, 6, 3, 7, 2, 7, 3, 3, 3, 1, 3, 6,
                          3, 5, 2, 6, 8, 3, 4, 5, 2, 1, 3, 1, 2, NA),
                   x3 = c(7, 8, 5, 6, 4, 2, 8, 3, 6, 1, 2, 5, 8, 6,
                          2, 5, 3, 1, 6, 4, 5, 5, 3, 6, 3, 2, 2, 4))

#--------------------------------------
# Two-sample design

# Two-Sided 95% CI for y1 by group1
# unknown population variances, unequal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1)

# Two-Sided 95% CI for y1 by group1
# unknown population variances, equal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1, var.equal = TRUE)

# Two-Sided 95% CI with known standard deviations for x1 by group1
# known population standard deviations, equal standard deviation assumption
ci.mean.diff(x1 ~ group1, data = dat1, sigma = 1.2)

# Two-Sided 95% CI with known standard deviations for x1 by group1
# known population standard deviations, unequal standard deviation assumption
ci.mean.diff(x1 ~ group1, data = dat1, sigma = c(1.5, 1.2))

# Two-Sided 95% CI with known variance for x1 by group1
# known population variances, equal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1, sigma2 = 1.44)

# Two-Sided 95% CI with known variance for x1 by group1
# known population variances, unequal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1, sigma2 = c(2.25, 1.44))

# One-Sided 95% CI for y1 by group1
# unknown population variances, unequal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1, alternative = "less")

# Two-Sided 99% CI for y1 by group1
# unknown population variances, unequal variance assumption
ci.mean.diff(x1 ~ group1, data = dat1, conf.level = 0.99)

# Two-Sided 95% CI for y1 by group1
# unknown population variances, unequal variance assumption
# print results with 3 digits
ci.mean.diff(x1 ~ group1, data = dat1, digits = 3)

# Two-Sided 95% CI for y1 by group1
# unknown population variances, unequal variance assumption
# convert value 4 to NA
ci.mean.diff(x1 ~ group1, data = dat1, as.na = 4)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption,
# listwise deletion for missing data
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1, na.omit = TRUE)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption,
# analysis by group2 separately
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1, group = dat1$group2)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption,
# analysis by group2 separately, sort by variables
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1, group = dat1$group2,
             sort.var = TRUE)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption,
# split analysis by group2
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1, split = dat1$group2)

# Two-Sided 95% CI for y1, y2, and y3 by group1
# unknown population variances, unequal variance assumption,
# analysis by group2 separately, split analysis by group3
ci.mean.diff(cbind(x1, x2, x3) ~ group1, data = dat1,
             group = dat1$group2, split = dat1$group3)

#-----------------

group1 <- c(3, 1, 4, 2, 5, 3, 6, 7)
group2 <- c(5, 2, 4, 3, 1)

# Two-Sided 95% CI for the mean difference between group1 and group2
# unknown population variances, unequal variance assumption
ci.mean.diff(group1, group2)

# Two-Sided 95% CI for the mean difference between group1 and group2
# unknown population variances, equal variance assumption
ci.mean.diff(group1, group2, var.equal = TRUE)

#--------------------------------------
# Paired sample design

 dat2 <- data.frame(pre = c(1, 3, 2, 5, 7, 6),
                    post = c(2, 2, 1, 6, 8, 9),
                    group = c(1, 1, 1, 2, 2, 2), stringsAsFactors = FALSE)

# Two-Sided 95% CI for the mean difference in pre and post
# unknown population variance of difference scores
ci.mean.diff(dat2$pre, dat2$post, paired = TRUE)

# Two-Sided 95% CI for the mean difference in pre and post
# unknown population variance of difference scores
# analysis by group separately
ci.mean.diff(dat2$pre, dat2$post, paired = TRUE, group = dat2$group)

# Two-Sided 95% CI for the mean difference in pre and post
# unknown population variance of difference scores
# split analysis by group
ci.mean.diff(dat2$pre, dat2$post, paired = TRUE, split = dat2$group)

# Two-Sided 95% CI for the mean difference in pre and post
# known population standard deviation of difference scores
ci.mean.diff(dat2$pre, dat2$post, sigma = 2, paired = TRUE)

# Two-Sided 95% CI for the mean difference in pre and post
# known population variance of difference scores
ci.mean.diff(dat2$pre, dat2$post, sigma2 = 4, paired = TRUE)

# One-Sided 95% CI for the mean difference in pre and post
# unknown population variance of difference scores
ci.mean.diff(dat2$pre, dat2$post, alternative = "less", paired = TRUE)

# Two-Sided 99% CI for the mean difference in pre and post
# unknown population variance of difference scores
ci.mean.diff(dat2$pre, dat2$post, conf.level = 0.99, paired = TRUE)

# Two-Sided 95% CI for for the mean difference in pre and post
# unknown population variance of difference scores
# print results with 3 digits
ci.mean.diff(dat2$pre, dat2$post, paired = TRUE, digits = 3)

# Two-Sided 95% CI for for the mean difference in pre and post
# unknown population variance of difference scores
# convert value 1 to NA
ci.mean.diff(dat2$pre, dat2$post, as.na = 1, paired = TRUE)
# }

Run the code above in your browser using DataLab